Everyone is talking about AI right now. ChatGPT, Claude, Gemini — it feels like it all just appeared out of nowhere in 2022. But that's not the real story.
AI has been in the making for over 70 years. There were brilliant ideas, terrible failures, long winters, and quiet breakthroughs that most people never heard about. And then — boom — suddenly everyone's using it to write emails and debug code.
This post is the full story. Where AI started, how it almost died twice, and how we got to the point where a machine can hold a conversation, write poetry, and help you build a business.
The Question That Started Everything
It starts with a man named Alan Turing. British mathematician. World War II codebreaker. And the first person to seriously ask: can machines think?
In 1950, Turing published a paper called "Computing Machinery and Intelligence." He proposed what he called the Imitation Game — later known as the Turing Test. The idea was simple: if a human judge, conversing through text, can't reliably tell whether they're talking to a machine or another human, the machine can fairly be called intelligent. This one paper planted the seed for everything that came after.
"Can machines think?" — Alan Turing, 1950. A question so simple it took 70 years to start answering.
Around the same time, scientists were figuring out how the brain works at a biological level. Warren McCulloch and Walter Pitts proposed a mathematical model of a neuron in 1943 — basically a circuit that could fire signals. This became the foundation of what we now call neural networks. Nobody knew it then, but this idea would eventually power ChatGPT and Claude.
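To see how simple the original idea was, here's a minimal sketch of a McCulloch-Pitts neuron in Python. The weights and thresholds below are illustrative choices, not values from the 1943 paper:

```python
def mcculloch_pitts_neuron(inputs, weights, threshold):
    """Fire (output 1) only if the weighted sum of binary inputs reaches the threshold."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# With the right weights and threshold, a single neuron behaves like a logic gate.
AND = lambda a, b: mcculloch_pitts_neuron([a, b], [1, 1], threshold=2)
OR = lambda a, b: mcculloch_pitts_neuron([a, b], [1, 1], threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(f"AND({a},{b}) = {AND(a, b)}   OR({a},{b}) = {OR(a, b)}")
```

One such "neuron" acts as a logic gate. Wire enough of them together and, in principle, you can compute anything a digital circuit can.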
The Year AI Got Its Name
In the summer of 1956, a group of researchers gathered at Dartmouth College in the US. Among them: John McCarthy, Marvin Minsky, and Claude Shannon. McCarthy proposed calling this field "artificial intelligence." The name stuck.
The mood was optimistic — almost naively so. People genuinely believed that within a generation, machines would be able to do anything a human could do. Herbert Simon predicted in 1965 that machines would be capable of doing any work a man can do within twenty years. He was off by about 40 years, and counting.
Still, the 1950s and early 60s saw real progress: programs that could play chess, solve algebra problems, prove mathematical theorems. Simple stuff by today's standards — but genuinely impressive for machines made of vacuum tubes.
The First AI Winter
Then reality hit. Hard.
All those early AI programs had one big problem: they could only work in very limited, structured environments. The real world is messy. Language is ambiguous. Problems don't come pre-formatted.
In 1969, Minsky and Seymour Papert published Perceptrons, a book that mathematically proved the limits of simple, single-layer neural networks: they couldn't learn even a function as basic as XOR. Funding dried up. Governments lost interest. Labs shut down. This period — roughly 1974 to 1980 — is called the First AI Winter. The field almost died.
Expert Systems and the Second Hype Cycle
AI came back in the 1980s, but differently. Instead of trying to make machines think like humans, researchers built "expert systems" — programs that encoded human expertise into rules. If the patient has a fever AND a rash, THEN diagnose X.
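A toy version of the idea, sketched in Python. The rules and conditions here are invented for illustration; real systems like MYCIN encoded hundreds of hand-written rules developed with domain experts:

```python
# Each rule is a condition over the patient's findings plus a conclusion.
rules = [
    (lambda f: f["fever"] and f["rash"], "possible measles"),
    (lambda f: f["fever"] and f["cough"], "possible flu"),
    (lambda f: f["rash"] and not f["fever"], "possible allergy"),
]

def diagnose(findings):
    """Return every conclusion whose IF-part matches the findings."""
    matches = [conclusion for condition, conclusion in rules if condition(findings)]
    return matches or ["no rule matched"]

print(diagnose({"fever": True, "rash": True, "cough": False}))
# -> ['possible measles']
```

The weakness is visible even at this scale: any case the rule-writers didn't anticipate simply falls through, and every fix means another hand-written rule.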
These actually worked, for narrow problems. Companies started investing again. Japan launched a massive "Fifth Generation Computer" project. Then, in the late 1980s, the market collapsed once more — the Second AI Winter. Expert systems were fragile, expensive to maintain, and couldn't learn.
The Quiet Comeback
While the field was publicly frozen over, something quiet was happening in the background. Researchers kept working on neural networks — specifically on a technique called backpropagation, which lets a network learn from its mistakes by tracing each error backwards through its layers and adjusting every weight a little in response.
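To make that concrete, here's a minimal sketch of backpropagation in plain numpy: a tiny two-layer network learning XOR, the very function Minsky and Papert proved a single-layer network couldn't handle. The layer sizes, learning rate, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# The XOR problem: inputs and the targets the network should learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two layers of weights plus biases.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for step in range(20_000):
    # Forward pass: compute the network's current guesses.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: trace the error back through each layer.
    output_delta = (output - y) * output * (1 - output)
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)

    # Nudge every weight a little, against its gradient.
    W2 -= 0.5 * hidden.T @ output_delta
    b2 -= 0.5 * output_delta.sum(axis=0)
    W1 -= 0.5 * X.T @ hidden_delta
    b1 -= 0.5 * hidden_delta.sum(axis=0)

print(output.round(2).ravel())  # approaches [0, 1, 1, 0]
```

Run it and the outputs settle near 0, 1, 1, 0: a hidden layer plus backpropagation cracks exactly the problem that helped trigger the first winter.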
Geoffrey Hinton, Yann LeCun, and Yoshua Bengio — the three people later called the "Godfathers of AI" — were all working on this through the 80s and 90s. Not getting much funding. Not getting much attention. Just quietly figuring out how to make neural networks actually learn.
In 1997, IBM's Deep Blue beat Garry Kasparov at chess. The world's best chess player, beaten by a machine. It wasn't really "AI" in the modern sense — it was brute-force calculation — but it showed that machines could outperform humans at something previously thought impossible.
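For a feel of what "brute-force calculation" means, here's a sketch of minimax search, the classic game-tree algorithm at the heart of engines like Deep Blue. The toy game below is invented purely to show the mechanism; the real thing added chess-specific evaluation, pruning, and custom hardware:

```python
def minimax(state, depth, maximizing, moves, evaluate):
    """Score a position by searching ahead, assuming both sides play optimally."""
    options = moves(state)
    if depth == 0 or not options:
        return evaluate(state)
    scores = [minimax(s, depth - 1, not maximizing, moves, evaluate)
              for s in options]
    return max(scores) if maximizing else min(scores)

# Toy game: each turn a player either adds 1 to a counter or doubles it.
# The maximizer wants the counter high, the minimizer wants it low.
moves = lambda n: [n + 1, n * 2] if n < 100 else []
print(minimax(1, depth=6, maximizing=True, moves=moves, evaluate=lambda n: n))
```

No learning anywhere in there. Just exhaustive look-ahead, which is why Deep Blue could win at chess without understanding anything.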
Deep Learning Changes Everything
This is the inflection point. The moment that actually created modern AI.
In 2006, Hinton published a paper showing how to train deep neural networks — networks with many layers — effectively. Then came ImageNet 2012. A team from Hinton's lab entered a computer vision competition. Their model — AlexNet — destroyed the competition, cutting the error rate nearly in half.
The AI research world stopped and took notice. Deep learning went from "fringe idea" to "everyone needs to understand this" almost overnight.
Why deep learning was different
Previous AI needed humans to define the rules. Deep learning figures out the rules itself — by looking at millions of examples and adjusting its internal weights until it gets good at a task. It's exactly what the little XOR sketch above does, just at a vastly larger scale. This is the core shift that made modern AI possible.
The Paper That Built ChatGPT
In 2017, a team of eight researchers at Google published a paper titled "Attention Is All You Need." It introduced the Transformer architecture.
Before Transformers, AI models processed text word by word, in sequence. By the end of a long sentence, the model had almost forgotten the beginning. Transformers solved this with "attention" — a mechanism that lets the model look at all words simultaneously and understand which ones are most relevant to each other.
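Here's a minimal numpy sketch of the scaled dot-product attention at the core of the paper. The random vectors stand in for word embeddings; in a real Transformer the Q, K, V projections are learned, and there are many attention heads and layers:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how relevant is each position to each other?
    weights = softmax(scores)        # each row becomes a probability distribution
    return weights @ V               # blend every position's value by relevance

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8              # five "words", eight-dimensional embeddings
x = rng.normal(size=(seq_len, d_model))

# In a real Transformer, the Q, K, V projections are learned during training.
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (5, 8): one context-aware vector per position
```

Every position attends to every other position in one shot, which is both why long-range context works and why the computation parallelizes so well on GPUs.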
This paper is directly responsible for GPT, ChatGPT, Claude, Gemini, and every major language model that exists today. The "T" in GPT literally stands for Transformer. That's how foundational this one paper was.
OpenAI Builds the GPT Series
OpenAI was founded in 2015 by Sam Altman, Elon Musk, Greg Brockman, and others — originally as a non-profit. They started building language models using the Transformer architecture.
In 2019 came GPT-2, which was so capable that OpenAI initially refused to fully release it — worried about misuse. The AI world laughed a little at this. Then saw the outputs. Stopped laughing.
GPT-3 arrived in 2020 with 175 billion parameters — a scale nobody had attempted before. It could write essays, answer questions, generate code, and carry conversations. Not perfectly, but well enough to genuinely surprise people. This was the model that made the tech world say: something is really happening here.
ChatGPT Hits the World
On November 30, 2022, OpenAI launched ChatGPT. It used a technique called Reinforcement Learning from Human Feedback (RLHF) — humans rated the model's responses, and the model learned to produce responses humans preferred. This made it not just smart but helpful, safe, and conversational in a way previous models weren't.
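RLHF has several moving parts; the sketch below shows just the heart of it, a reward model learning from pairwise preferences (chosen vs. rejected responses). Everything here is simplified: the "responses" are made-up feature vectors and the reward model is a single linear layer, and the later stage, where the language model itself is tuned with reinforcement learning against this learned reward, is omitted entirely:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
w = np.zeros(d)                     # the reward model: a single linear layer

# Fake a dataset of human preferences: a hidden "taste" vector decides
# which of two candidate responses a rater would prefer.
true_pref = rng.normal(size=d)
pairs = []
for _ in range(500):
    a, b = rng.normal(size=(2, d))  # two candidate "responses" as feature vectors
    chosen, rejected = (a, b) if a @ true_pref > b @ true_pref else (b, a)
    pairs.append((chosen, rejected))

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for epoch in range(200):
    for chosen, rejected in pairs:
        # Bradley-Terry-style objective: push score(chosen) above score(rejected).
        p = sigmoid(w @ chosen - w @ rejected)
        w -= 0.01 * (p - 1) * (chosen - rejected)

agree = sum((w @ c) > (w @ r) for c, r in pairs)
print(f"reward model matches the preferences on {agree}/{len(pairs)} pairs")
```

The learned scorer ends up agreeing with the raters, and that score is what the chatbot is then trained to maximize.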
It hit one million users in five days. One hundred million in two months. Instagram took two and a half years to reach 100 million users. ChatGPT did it in eight weeks.
This wasn't just a viral moment. It was a turning point. Suddenly every company, every founder, every government was talking about AI. People who had never heard of GPT were holding conversations with a machine and could hardly believe what they were seeing.
How Anthropic and Claude Started
The story of Claude actually begins inside OpenAI.
Dario Amodei was VP of Research at OpenAI. His sister Daniela Amodei was VP of Operations. Along with several senior researchers, they had growing concerns about the direction of AI development — specifically around safety. They believed AI was advancing faster than the safety research needed to understand and control it.
In 2021, they left OpenAI and founded Anthropic. The mission: do the safety research, and build AI systems that are safe, honest, and beneficial. Not just capable.
Anthropic's first major contribution wasn't a product — it was a research technique called Constitutional AI. Instead of having humans rate every response, you give the model a set of principles — a constitution — and have it evaluate its own outputs against those principles. More scalable. More transparent.
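A sketch of what that loop might look like. The principles below are paraphrased for illustration (not Anthropic's actual constitution), and generate is a hypothetical stand-in for a real language-model call:

```python
# A sketch of the Constitutional AI critique-and-revise loop.

CONSTITUTION = [
    "Choose the response that is most helpful and honest.",
    "Avoid responses that could help someone cause harm.",
    "Acknowledge uncertainty rather than guessing.",
]

def generate(prompt: str) -> str:
    # Placeholder so the sketch runs end to end; a real implementation
    # would call a language model here.
    return f"[model output for: {prompt[:50]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        # The model critiques its own draft against one principle...
        critique = generate(
            f"Principle: {principle}\nResponse: {draft}\n"
            "Does this response violate the principle? Explain."
        )
        # ...then rewrites the draft to address its own critique.
        draft = generate(
            f"Response: {draft}\nCritique: {critique}\n"
            "Rewrite the response to address the critique."
        )
    return draft

print(constitutional_revision("How do I pick a strong password?"))
```

Roughly speaking, in Anthropic's published approach the self-revised answers then become training data, so the improvements get baked into the model rather than recomputed on every request.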
Claude launched publicly in early 2023. Reportedly named after Claude Shannon, the mathematician who founded information theory. From the beginning, Claude was designed differently: more focused on being honest about uncertainty, refusing harmful requests more consistently, and explaining its reasoning.
The Race
After ChatGPT, the race was on. Google launched Bard (later Gemini). Meta released its Llama models with open weights. Microsoft invested $10 billion into OpenAI. Apple announced Apple Intelligence. Dozens of AI companies raised billions.
Claude kept getting better — Claude 2, Claude 3, Claude 3.5, and onwards — each version sharper, with longer context windows, and more useful for real work. By 2025, Claude had become the go-to model for many developers and businesses that wanted something capable but also reliably careful.
We're not at the end of this story. We're maybe at chapter three.
Want AI working inside your business — not just beside it?
At Manas AI we build custom AI agents, automation workflows, and MCP servers for startups and SMBs. No generic tools. Stuff that actually fits how your business works.
Talk to us →


