The State of AI

Access the report here: stateof.ai

Transcript

All right, let’s dive in. We’re tackling the State of AI Report 2024 this time around. It’s the seventh year they’ve put this out. Nathan Benaich and Air Street Capital, they really have their fingers on the pulse of AI. Talk about a must-read if you want to understand what’s really happening in the world of AI.

No kidding. Remember last year, everyone was buzzing about OpenAI. GPT-4 seemed impossible to beat for a while there.

Right. Well, this year’s report shows that the playing field’s evening out. Google’s got their models, Anthropic too. Even Meta’s getting in on the action. And their benchmarks are nothing to sneeze at. Claude 3.5 Sonnet, Gemini 1.5. They’re going head to head with OpenAI now.

And this is a big one, the rise of open models. It’s a real turning point. Especially Meta’s Llama 3.

Right. For the first time, you’ve got an open model that’s right up there with the big proprietary players in terms of performance.

It’s interesting, though, because when we talk about open, it’s not always as straightforward as it seems. The report spends a lot of time on this.

Yeah, there’s a lot of nuance. Open means different things to different people.

Right. Exactly. Some projects are very transparent with their weights, data, licensing, the whole nine yards. Others, not so much. It’s something to keep in mind as we see more and more of these open-source models popping up. We have to be critical about what open really means in practice.

It’s almost like the Wild West out there. A lot of potential, but still figuring out the rules of the game.

Exactly. And that ties into another big issue the report digs into. Benchmarking. How do we actually measure progress in AI? There are some real challenges there.

Right. Like dataset contamination, where test data might be leaking into the training sets.

Right. And that can make results look better than they actually are. The report even points to a study that found errors in the MMLU benchmark, one of the most popular ones used to evaluate language models. So we could be getting a skewed view of how much progress is being made, either overestimating or maybe even underestimating what these models can actually do.
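(Editor’s note: to make dataset contamination concrete, a common first-pass check is to scan benchmark items for verbatim n-gram overlap with training documents. The sketch below is purely illustrative; real decontamination pipelines use hashing and fuzzier matching, and the function names are ours, not the report’s.)

```python
def ngrams(text: str, n: int = 8) -> set:
    """Word-level n-grams of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(test_item: str, training_docs: list, n: int = 8) -> bool:
    """Flag a benchmark item if any of its n-grams appears verbatim in training data."""
    test_grams = ngrams(test_item, n)
    return any(test_grams & ngrams(doc, n) for doc in training_docs)
```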

Exactly. And that’s why the report stresses the need for better, more transparent ways to evaluate these AI systems. If we’re going to compare them, we need to be playing by the same rules, right?

Makes sense. And speaking of different approaches, remember neurosymbolic systems? The report highlights how they’re making a comeback, combining deep learning with good old-fashioned symbolic reasoning.

Yeah, and it’s showing real promise. The report talks about AlphaGeometry, a project from Google DeepMind. It’s achieving near-human performance on some super complex geometry problems, like the kind they use in math Olympiads. So it seems like these hybrid models might be able to tackle problems that traditional deep learning has struggled with, problems that need both raw processing power and the ability to reason abstractly.

Totally. And while we’re talking about improving AI, we can’t forget about efficiency because those powerful models often come with a hefty computational cost.

Right. So it’s not just about making AI smarter, but also making it leaner and more efficient.

Right. And that’s where things like model shrinking and distillation come in. Techniques for slimming down those massive models without sacrificing performance.
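(Editor’s note: for readers curious what distillation looks like in code, here is a minimal PyTorch-style sketch of the classic soft-target loss, where a small student model learns to match a large teacher. The temperature and weighting are illustrative defaults, not figures from the report.)

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend the teacher's softened distribution with the usual hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitude stays comparable across temperatures
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```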

That sounds crucial if we want to run AI on everyday devices like our phones. Imagine personalized AI that can adapt to your needs on the fly without needing a giant data center to run.

And the report points to some exciting developments in that area, like representation fine-tuning, or ReFT. Instead of retraining all of the model’s weights, it learns small edits to the model’s internal representations, which makes adapting a model far cheaper.

Yeah. Like fine-tuning the settings on your camera instead of buying a whole new lens.
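(Editor’s note: a rough sketch of the idea behind low-rank representation interventions in the spirit of ReFT. The base model stays frozen and only a small edit to its hidden states is learned. This is our illustration; it omits details such as the orthonormality constraint ReFT places on the projection.)

```python
import torch
import torch.nn as nn

class LowRankIntervention(nn.Module):
    """A learned edit applied to a frozen model's hidden states at chosen layers."""

    def __init__(self, hidden_dim: int, rank: int = 4):
        super().__init__()
        self.R = nn.Linear(hidden_dim, rank, bias=False)  # low-rank projection
        self.W = nn.Linear(hidden_dim, rank)              # learned edit source

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h' = h + R^T (W h + b - R h): edit h only inside a small subspace.
        delta = self.W(h) - self.R(h)
        return h + delta @ self.R.weight  # R.weight has shape (rank, hidden_dim)
```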

Exactly. And speaking of data, what about all this talk about synthetic data for training? It’s promising, right? Potentially less biased than real-world datasets. But there’s also that risk of model collapse, where errors in the synthetic data get amplified during training.

Garbage in, garbage out, as they say.

Exactly. And that’s why the report emphasizes the importance of not just the quantity of data, but the quality.

Absolutely. They highlight FineWeb, a project from Hugging Face where they built this massive dataset for training language models. 15 trillion tokens. But the key was they were really picky about the data they used, curated it carefully. Quality over quantity every time.

And this focus on context is crucial, especially for things like retrieval augmented generation or RAG, where the AI is pulling in outside information to answer your query.

Right. It’s not just about finding keywords anymore, but understanding how all that information fits together. And the report highlights some cool work on contextual embeddings. Trying to teach AI to think more like that librarian who helps you track down the perfect book, not just the one with the right words in the title.
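(Editor’s note: a bare-bones sketch of the retrieve-then-generate loop behind RAG. Here `embed` and `llm` are placeholders for whatever embedding model and LLM you use; the contextual-embedding work discussed above is essentially about making that `embed` step context-aware rather than keyword-driven.)

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rag_answer(query: str, docs: list, embed, llm, k: int = 3) -> str:
    q = embed(query)
    # Rank documents by embedding similarity to the query (cache embeddings in practice).
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = "\n\n".join(ranked[:k])
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
```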

Exactly. And while we’re talking about different players in the AI world, the report also dives into the rise of Chinese AI. Even with the U.S. sanctions, labs like DeepSeek or 01.AI, they’re making waves. And some of their open-source projects are becoming really popular, like DeepSeek’s Coder model. It’s a good reminder that this is a global race.

Absolutely. And speaking of unexpected advancements, who would have guessed that diffusion models, which blew everyone away with text-to-image generation, would end up being used in robotics?

Sounds like they’re using them to generate complex action sequences for robots, creating a kind of shared representation of the robot’s perception and its possible actions.

It’s amazing how breakthroughs in one area of AI can lead to these unexpected advances in other fields. That cross-pollination is so important.

And while we’re on the topic of robots, remember those robot dogs everyone was obsessed with a while back?

Oh yeah, Spot, the Boston Dynamics robot dog.

That’s the one. Well, it’s back in a big way, and this time it’s not just about looking cool. Researchers are using it for all sorts of cutting-edge work. A team from Stanford and Columbia is working on improving its grasping and manipulation skills. Instead of controlling each joint individually, they’re focusing on the overall movement of the gripper.

That’s fascinating. Makes it easier to transfer those skills from, say, a stationary robotic arm to a mobile robot like Spot.

Exactly. And even the Apple Vision Pro, which hasn’t really taken off as a consumer product, is finding a home in robotics research.

Yeah. The report mentions how its sensors and spatial awareness are perfect for teleoperation, like controlling robots remotely with incredible precision.

It just goes to show you never know where technology will end up having the biggest impact.

Speaking of impact, the quest for Artificial General Intelligence, AGI, it’s still a driving force. That dream of creating AI that can truly rival human intelligence across a wide range of tasks.

Right. And the report highlights the ARC Prize, a million-dollar prize fund aimed at accelerating progress towards AGI. It’s a fascinating goal, but also a bit of a moving target, because what does it even mean to achieve AGI?

Our understanding of intelligence itself is constantly evolving.

It’s a good point. It’s a question that philosophers and scientists have been grappling with for centuries.

But while we’re pondering the nature of intelligence, the report reminds us that current AI systems still face some very real limitations.

Yeah, like LLMs, as impressive as they are, they still struggle with things like planning and simulation, especially when it comes to generalizing beyond the data they’ve been trained on.

It’s like they’re amazing at following instructions, but not so great at coming up with their own plans or understanding the consequences of their actions.

So we’re still a ways off from those truly autonomous thinking machines we see in sci-fi movies.

For sure. But researchers are exploring all sorts of interesting avenues to bridge that gap, like iterative prompting, where they give the model feedback and let it refine its responses, and integrating LLMs with methods like Monte Carlo tree search for better decision-making.
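(Editor’s note: iterative prompting at its simplest is a generate-critique-refine loop, as sketched below. `llm` stands in for any text-completion function; the Monte Carlo tree search work mentioned above layers systematic search on top of this kind of basic loop.)

```python
def refine(llm, task: str, rounds: int = 3) -> str:
    """Generate an answer, ask the model to critique it, then revise."""
    answer = llm(f"Task: {task}\nGive your best answer.")
    for _ in range(rounds):
        feedback = llm(f"Task: {task}\nAnswer: {answer}\nList concrete flaws.")
        answer = llm(
            f"Task: {task}\nPrevious answer: {answer}\n"
            f"Feedback: {feedback}\nWrite an improved answer."
        )
    return answer
```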

It’s all about pushing the boundaries, seeing what’s possible.

And that’s what makes this field so exciting. AI agents now—that’s something that sounds straight out of science fiction, but this report makes it clear they’re not just a fantasy anymore.

No, they’re becoming very real. Though building AI agents that can actually function in the real world, that’s a whole other story. The report goes pretty deep on the challenges there.

One of the biggest hurdles has to be dealing with, well, the unpredictability of it all. Real life throws curveballs that no algorithm can predict.

Absolutely. It’s one thing to train an AI in a controlled environment, a game for example, with clear rules.

Yeah. But the real world, that’s a whole different ballgame. You’re constantly having to adjust, adapt, think on your feet.

Exactly. And that’s why researchers are so focused on combining things like LLMs with reinforcement learning. You need that high-level reasoning of the LLMs, but also the ability to learn from experience that RL brings to the table.

So it’s like the LLM provides the strategy, the big-picture plan, and then the RL is the one figuring out the tactics, making those real-time adjustments based on what’s happening around it.

That’s a great way to put it. And it’s showing real promise.
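(Editor’s note: that strategy-versus-tactics split can be sketched as a simple loop in which an LLM decomposes the task into subgoals and a learned policy handles the low-level actions, updating from reward. This is our framing of the general pattern, with assumed `env` and `policy` interfaces, not the architecture of any specific system in the report.)

```python
def run_episode(llm, policy, env, task: str) -> float:
    """The LLM plans the subgoals; an RL policy executes and learns from reward."""
    subgoals = llm(f"Break this task into short subgoals:\n{task}").splitlines()
    total_reward = 0.0
    for goal in subgoals:
        obs, done = env.observe(), False
        while not done:
            action = policy.act(obs, goal)        # RL handles the tactics
            obs, reward, done = env.step(action)
            policy.update(obs, goal, reward)      # learn from experience
            total_reward += reward
    return total_reward
```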

Yeah. The report talks about DigiRL, a system specifically designed for training agents to operate on Android devices. And apparently, they’re seeing some impressive results.

Yeah, they’re talking about significant improvements in task success rates on real-world Android tasks.

But AI agents, they’re not just for our phones, right? We’re also talking about robotics.

Absolutely. Robotics is another field where these agents have huge potential. Imagine robots that can not just follow pre-programmed instructions but actually learn and adapt to their environment, manipulate objects, solve problems. We’re talking about robots that can understand a task like “clean up this messy kitchen” and actually do it right. Not just those repetitive tasks in a controlled factory setting.

Right. And that’s where things like foundation models come into play. They’re being used to create these incredibly realistic simulated environments where these AI agents can learn and practice these complex skills.

They can make mistakes, learn from them without any real-world consequences.

The report even talks about a system called Genie that can build these virtual worlds by analyzing video game footage.

It’s wild, right? They’re using the same technology that powers our entertainment to train these AI agents for the real world. It’s not just about making the simulations look real. It’s about injecting them with real-world physics, real-world challenges.

The report mentioned something about affordance information, adding that into the simulations. What exactly is that?

So think about how you, as a human, just intuitively know how to interact with the world. You know a cup is for holding liquids, a chair is for sitting on. It’s like our common-sense understanding of how things work.

Right. And affordance information is basically trying to teach that common sense to robots, helping them understand the properties of objects and how they can be used. It’s like giving them a crash course in being human, at least in terms of interacting with the physical world.

Exactly. And it turns out even things like chain-of-thought reasoning, which has been a big focus in language models, that can be applied to robots too.

So instead of just reacting to their surroundings, these robots are actually thinking through their actions step by step.

That’s the idea. Considering different possibilities, making more deliberate choices, it’s a big step towards robots that can reason and problem-solve more like we do.
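(Editor’s note: chain-of-thought prompting for a robot planner can be as simple as asking the model to reason before emitting actions. The skill names and prompt format below are invented for illustration.)

```python
PLANNER_PROMPT = """You control a kitchen robot with skills:
pick(object), place(object, location), wipe(surface).

Task: clear the mug from the table and wipe up the spill.

Think step by step about object locations, preconditions, and ordering,
then output one skill call per line under the heading PLAN:.
"""

def plan(llm, prompt: str = PLANNER_PROMPT) -> list:
    reply = llm(prompt)
    # Keep only the lines after "PLAN:", i.e. the executable skill calls.
    plan_part = reply.split("PLAN:", 1)[-1]
    return [line.strip() for line in plan_part.splitlines() if line.strip()]
```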

OK, now we’re getting into some seriously mind-blowing stuff. The report also dives into this idea of foundation models for the mind. Are we talking about AI that can read our thoughts now?

Well, not quite reading our thoughts, but definitely getting closer to understanding how the human brain works. And they’re using AI to do it. So these models are being trained on massive datasets of brain activity, fMRI recordings, things like that.

That’s right. And the insights they’re gleaning from that data are amazing. The report talks about BrainLM, a foundation model trained on thousands of hours of fMRI recordings. And this model can predict things like age, personality traits, even mental health conditions just from brain scans.

That’s incredible. And a little bit unnerving, right? It really highlights the power of these foundation models, but also the potential ethical implications.

But it gets even wilder.

Okay, I’m ready. Hit me with it.

There’s a generative model called MindEye2. It can actually reconstruct images that someone is seeing just by analyzing their brain activity.

Hold on. You’re saying they can show someone a picture, record their brainwaves, and then AI can recreate that image. That’s straight out of science fiction.

It really is. And it’s not perfect, of course, but it’s getting more and more accurate all the time.

That’s both amazing and terrifying at the same time.

But while we’re trying to wrap our heads around that, let’s talk about the bigger picture for a second. The report mentions a noticeable shift in how people are thinking about AI, like moving from this emphasis on safety to a more accelerationist mindset. It’s subtle, but it’s definitely there.

There’s a growing sense of urgency, this feeling that we need to be pushing the boundaries of AI as fast as possible, not just for the sake of progress, but because of the competition. The race is on and no one wants to fall behind.

Exactly. But of course, that raises questions, right? Are we moving too fast? Are we considering the potential risks? It’s like that classic dilemma, balancing progress with responsibility.

AI has the potential to solve some of humanity’s biggest challenges, but we also need to make sure we don’t create new ones in the process.

And one of those potential challenges the report highlights is the impact of AI on the power grid. These systems are incredibly energy-hungry.

Right. It’s not just about computational power anymore. It’s about having enough electricity to keep all these massive data centers running.

Exactly. And that’s why there’s so much research focused on making AI training more efficient, reducing that energy footprint.

One example is DiLoCo, an optimization algorithm from Google DeepMind.

I read about that. It’s about reducing the amount of data that needs to be exchanged during training, right? So you can train these massive models on more distributed networks.

Exactly. Instead of relying on these giant centralized data centers, which use a ton of energy, you can spread out the workload. It’s like finding ways to train these AI behemoths on a diet, making them more energy efficient without sacrificing performance.
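(Editor’s note: the pattern DiLoCo exploits, many local optimizer steps with only occasional synchronization through an outer optimizer, looks roughly like the sketch below. This is illustrative pseudocode with assumed `worker` objects and hyperparameters, not DeepMind’s implementation; the real DiLoCo uses Nesterov momentum for the outer step.)

```python
import copy
import torch

def diloco_style_round(global_model, workers, inner_steps, outer_opt):
    snapshot = copy.deepcopy(global_model.state_dict())
    deltas = []
    for worker in workers:
        local = copy.deepcopy(global_model)
        inner_opt = torch.optim.AdamW(local.parameters(), lr=1e-4)
        for _ in range(inner_steps):           # e.g. hundreds of local steps
            loss = worker.compute_loss(local)  # on this worker's own data shard
            inner_opt.zero_grad()
            loss.backward()
            inner_opt.step()
        deltas.append({k: snapshot[k] - v for k, v in local.state_dict().items()})
    # Only these averaged "pseudo-gradients" cross the network, not per-step
    # gradients; the outer optimizer applies them to the global model.
    for name, param in global_model.named_parameters():
        param.grad = sum(d[name] for d in deltas) / len(deltas)
    outer_opt.step()
    outer_opt.zero_grad()
```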

Very important. But it’s not just about efficiency. It’s also about finding new applications for this technology.

One area the report talks about is synthetic data in medicine.

Oh, yeah. That has huge potential. Think about medical imaging, diagnostics. Right now we rely on huge datasets of real patient data to train those models, which is expensive, time-consuming, and raises all sorts of privacy concerns. But with synthetic data, you could create those datasets without using any real patient information.

Precisely. And the report highlights a project where researchers used AI to generate synthetic chest X-rays that were so realistic they fooled experienced radiologists.

That’s incredible. It really shows the potential of synthetic data to revolutionize healthcare.

But of course, as with any powerful technology, there are always concerns. One that comes to mind is automation. We’ve already seen AI disrupt certain industries, replace jobs. What does the future hold as these systems become even more capable?

It’s a question a lot of people are asking, and it’s not an easy one to answer. The report talks about the challenges of traditional approaches to enterprise automation, like robotic process automation.

Those haven’t really lived up to the hype, have they?

Not quite. They tend to be brittle, expensive, difficult to adapt to new situations. But the report does point to a new wave of automation powered by these foundation models. So the same technology that’s driving things like ChatGPT, that’s now being applied to business processes.

Right. And they’re seeing some impressive results. The report mentions FlowMind, a system developed by JPMorgan. It uses LLMs to generate executable workflows for financial tasks. And it apparently achieves incredible accuracy in understanding and automating these complex processes.

So it’s like having an army of AI assistants all working together seamlessly behind the scenes to handle these complicated tasks.
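(Editor’s note: the generic “LLM writes an executable workflow” pattern the report describes can be sketched as follows: describe a vetted set of APIs to the model, have it generate code restricted to those APIs, then run the code in a confined namespace. The stand-in functions below are ours, not JPMorgan’s, and safely executing model-generated code needs far more care than shown here.)

```python
SAFE_APIS = {
    "get_balance": lambda account_id: 1000.0,                   # stand-in stub
    "send_report": lambda text, to: print(f"to {to}: {text}"),  # stand-in stub
}

def run_workflow(llm, request: str) -> None:
    api_docs = "\n".join(f"- {name}(...)" for name in SAFE_APIS)
    code = llm(
        f"Using ONLY these functions:\n{api_docs}\n"
        f"Write Python that satisfies this request: {request}\n"
        f"Return code only."
    )
    # Execute with no builtins and only the vetted APIs in scope.
    exec(code, {"__builtins__": {}}, dict(SAFE_APIS))
```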

That’s the idea. But of course, increased efficiency often means fewer jobs for humans. So how do we make sure the benefits of this AI-powered automation are shared, that workers aren’t left behind?

That’s the million-dollar question, isn’t it? It’s going to require a multi-pronged approach. Education, retraining, upskilling. And some honest conversations about the future of work in this rapidly changing landscape.

And those conversations need to happen now, not after it’s too late.

But speaking of the future, let’s turn our attention back to the hardware that’s powering it all. NVIDIA might be the dominant player right now, but the report makes it clear that the competition is heating up.

It’s hard to keep up, you know? It seems like every day there’s some new headline about AI. New breakthrough, new application, new company you’ve never even heard of. It’s a lot. And this report, even as comprehensive as it is, it’s really just a snapshot in time. Things are changing so fast.

That’s what makes it so fascinating though, right? We’re watching a technological revolution unfold in real time.

Exactly. It’s an incredible time to be paying attention to this field.

So where do we even go from here? If you had to distill it down, what are the key takeaways for someone trying to navigate this crazy world of AI?

Well, I think the most important thing is don’t believe the hype. There’s a lot of it out there. It’s easy to get caught up in the excitement, the fear, all of it.

Easier said than done, right? Especially when you see those headlines saying AI is either going to save the world or destroy it.

Right. At the end of the day, it’s important to remember AI is a tool, a very powerful tool, yes, but a tool nonetheless. And like any tool, it can be used for good or bad. It all depends on who’s using it and what they’re using it for.

That’s why it’s so crucial to be developing and deploying AI responsibly, thinking about safety, fairness, transparency, all of that.

And that requires understanding the technology, right? We can’t just leave it up to the engineers and call it a day. This affects all of us.

Absolutely. And that’s where resources like this report can be really valuable. It’s a great starting point for getting up to speed on the latest trends, the challenges, the big questions we should be asking.

But even beyond reading reports, there are so many ways to engage with AI these days. Experiment with the tools, try things out, learn some basic coding even.

Exactly. There’s no better way to understand something than to dive in and get your hands dirty.

It’s like learning a new language, right? The more fluent you become, the more you can engage with that world, understand different perspectives, contribute to the conversation.

I love that analogy. And it highlights something really important. The future of AI isn’t predetermined. It’s not some fixed path we’re on. It’s a story that’s still being written. And we all have a role to play in shaping how that story unfolds.

Exactly. So what can our listeners do today to become more informed, more empowered participants in this AI-powered future?

That’s the million-dollar question. Where do we even begin?

Well, start by asking questions. Don’t take anything for granted. Challenge assumptions. Think critically about the information you’re consuming.

Like that Einstein quote, right? The important thing is not to stop questioning.

Exactly. Curiosity is key. And don’t just rely on one source of information. Read widely. Listen to podcasts. Talk to experts. Attend conferences. The more perspectives you expose yourself to, the better.

It’s about becoming a discerning consumer of information, learning to separate the hype from the reality, and ultimately forming your own informed opinions.

Absolutely. And don’t be afraid to experiment. Try things out. Even if it’s just playing around with ChatGPT or Dall-E or trying to build a simple chatbot yourself, you’ll learn a lot more by doing than by just reading about it.

It’s like anything else, right? You can read the manual all you want, but you’ll never really learn to ride a bike until you actually get on one and give it a try.

Exactly. And who knows, you might even discover a passion for AI you never knew you had.

So as we wrap up this deep dive into the state of AI report 2024, let’s leave our listeners with one final thought. If AI can already create stunning works of art, write compelling stories, even help us understand the mysteries of the human brain, what seemingly impossible task might it conquer next?

That’s a question for all of us to ponder. The future of AI is full of possibilities. It’s up to all of us to ensure those possibilities lead to a brighter, more equitable, and awe-inspiring future for everyone.

And that’s a wrap. We’ll see you next time for another deep dive into the world of AI.

