
Google demos out AI video generator Veo with the help of Donald Glover – Mashable

Posted: May 15, 2024 at 2:36 am

Google, with the help of creative renaissance man Donald Glover, has demoed an AI video generator to compete with OpenAI's Sora. The model is called Veo, and while no clear launch date or rollout plan has been announced, the demo does appear to show a Sora-like product, apparently capable of generating high-quality, convincing video.

What's "cool" about Veo? "You can make a mistake faster," Glover said in a video shown during Google's I/O 2024 livestream. "That's all you really want at the end of the day, at least in art, is just to make mistakes fast."

Credit: Mashable screenshot from a Google promo

Speaking onstage at Google I/O, Google DeepMind CEO Demis Hassabis said, "Veo creates high quality 1080p videos from text, image and video prompts." That makes Veo the same type of tool as Sora, with the same resolution as Sora's highest setting. A slider shown in the demo stretches a Veo video's length to a little over one minute, also the approximate length of a Sora video.

Since Veo and Sora are both unreleased products, there's little use in comparing them in detail at this point. However, according to Hassabis, the interface will allow Veo users to "further edit your videos using additional prompts," a function Sora doesn't currently offer, according to creators who have been given access.

What was Veo trained on? That's not currently clear. About a month ago, YouTube CEO Neal Mohan told Bloomberg that if OpenAI used YouTube videos to train Sora, that would be a "clear violation" of the YouTube terms of service. However, YouTube's parent company Alphabet also owns Google, which made Veo. Mohan strongly implied in that Bloomberg interview that YouTube does feed content to Google's AI models, but only, he claims, when users sign off on it.

What we do know about the creation of Veo is that, according to Hassabis, the model is the culmination of many similar projects from Google and DeepMind, including DeepMind's Generative Query Network (GQN) research published back in 2018, last year's VideoPoet, Google's rudimentary video generator Phenaki, and Google's Lumiere, which was demoed earlier this year.

Glover's specific AI-enabled filmmaking project hasn't been announced. In the video shown at I/O, Glover says he's "been interested in AI for a couple of years now" and that he reached out to Google, apparently not the other way around. "We got in contact with some of the people at Google and they had been working on something of their own, so we're all meeting," Glover says in the demo video.

There's currently no way for the general public to try Veo, but there is a waitlist signup page.

Visit link:

Google demos out AI video generator Veo with the help of Donald Glover - Mashable

Recommendation and review posted by G. Smith

Project Astra is the future of AI at Google – The Verge

Posted: May 15, 2024 at 2:36 am

"I've had this vision in my mind for quite a while," says Demis Hassabis, the head of Google DeepMind and the leader of Google's AI efforts. Hassabis has been thinking about and working on AI for decades, but four or five years ago, something really crystallized. One day soon, he realized, "we would have this universal assistant. It's multimodal, it's with you all the time." Call it the Star Trek Communicator; call it the voice from Her; call it whatever you want. "It's that helper," Hassabis continues, "that's just useful. You get used to it being there whenever you need it."

At Google I/O, the company's annual developer conference, Hassabis showed off a very early version of what he hopes will become that universal assistant. Google calls it Project Astra, and it's a real-time, multimodal AI assistant that can see the world, knows what things are and where you left them, and can answer questions or help you do almost anything. In an incredibly impressive demo video that Hassabis swears is not faked or doctored in any way, an Astra user in Google's London office asks the system to identify a part of a speaker, find their missing glasses, review code, and more. It all works practically in real time and in a very conversational way.

Astra is just one of many Gemini announcements at this year's I/O. There's a new model, called Gemini 1.5 Flash, designed to be faster for common tasks like summarization and captioning. Another new model, called Veo, can generate video from a text prompt. Gemini Nano, the model designed to be used locally on devices like your phone, is supposedly faster than ever as well. The context window for Gemini Pro, which refers to how much information the model can consider in a given query, is doubling to 2 million tokens, and Google says the model is better at following instructions than ever. Google's making fast progress both on the models themselves and on getting them in front of users.
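To put that context window in perspective, here is a rough back-of-envelope conversion; the tokens-to-words ratio is a common rule of thumb, not a published Gemini figure:

```python
# Rough rule of thumb (an assumption, not a Gemini spec): one token is
# roughly 0.75 English words, or about 4 characters.
TOKENS = 2_000_000
words = TOKENS * 0.75        # ~1.5 million words
pages = words / 500          # ~3,000 pages at 500 words per page
print(f"~{words:,.0f} words, roughly {pages:,.0f} pages of text")
```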

Going forward, Hassabis says, the story of AI will be less about the models themselves and all about what they can do for you. And that story is all about agents: bots that don't just talk with you but actually accomplish stuff on your behalf. "Our history in agents is longer than our generalized model work," he says, pointing to the game-playing AlphaGo system from nearly a decade ago. Some of those agents, he imagines, will be ultra-simple tools for getting things done, while others will be more like collaborators and companions. "I think it may even be down to personal preference at some point," he says, "and understanding your context."

Astra, Hassabis says, is much closer than previous products to the way a true real-time AI assistant ought to work. When Gemini 1.5 Pro, the latest version of Google's mainstream large language model, was ready, Hassabis says he knew the underlying tech was good enough for something like Astra to begin to work well. But the model is only part of the product. "We had components of this six months ago," he says, "but one of the issues was just speed and latency. Without that, the usability isn't quite there." So, for six months, speeding up the system has been one of the team's most important jobs. That meant improving the model but also optimizing the rest of the infrastructure to work well and at scale. Luckily, Hassabis says with a laugh, "that's something Google does very well!"

A lot of Google's AI announcements at I/O are about giving you more and easier ways to use Gemini. A new product called Gemini Live is a voice-only assistant that lets you have easy back-and-forth conversations with the model, interrupting it when it gets long-winded or calling back to earlier parts of the conversation. A new feature in Google Lens allows you to search the web by shooting and narrating a video. A lot of this is enabled by Gemini's large context window, which means it can access a huge amount of information at a time, and Hassabis says it's crucial to making it feel normal and natural to interact with your assistant.

Know who agrees with that assessment, by the way? OpenAI, which has been talking about AI agents for a while now. In fact, the company demoed a product strikingly similar to Gemini Live barely an hour after Hassabis and I chatted. The two companies are increasingly fighting for the same territory and seem to share a vision for how AI might change your life and how you might use it over time.

How exactly will those assistants work, and how will you use them? Nobody knows for sure, not even Hassabis. One thing Google is focused on right now is trip planning: it built a new tool for using Gemini to build an itinerary for your vacation that you can then edit in tandem with the assistant. There will eventually be many more features like that. Hassabis says he's bullish on phones and glasses as key devices for these agents but also says there is probably room for "some exciting form factors." Astra is still in an early prototype phase and only represents one way you might want to interact with a system like Gemini. The DeepMind team is still researching how best to bring multimodal models together and how to balance ultra-huge general models with smaller and more focused ones.

We're still very much in the "speeds and feeds" era of AI, in which every incremental model matters and we obsess over parameter sizes. But pretty quickly, at least according to Hassabis, we're going to start asking different questions about AI. Better questions. Questions about what these assistants can do, how they do it, and how they can make our lives better. Because the tech is a long way from perfect, but it's getting better really fast.

Go here to see the original:

Project Astra is the future of AI at Google - The Verge

Recommendation and review posted by G. Smith

Senators to seek billions for AI research, push for regulation – The Washington Post

Posted: May 15, 2024 at 2:36 am

A bipartisan group of senators, including Majority Leader Charles E. Schumer, will unveil a long-awaited road map this week for regulating artificial intelligence, directing Congress to infuse billions of dollars into research and development of the technology while addressing its potential harms.

The sprawling directive comes almost a year after Schumer (D-N.Y.) called for an "all hands on deck" push to regulate AI, saying Congress needed to accomplish years of work in months.

Read more here:

Senators to seek billions for AI research, push for regulation - The Washington Post

Recommendation and review posted by G. Smith

The SF Bay Area Has Become The Undisputed Leader In AI Tech And Funding Dollars – Crunchbase News

Posted: May 15, 2024 at 2:36 am

There's been much talk of a resurgent San Francisco, with the new technology wave of artificial intelligence washing over the software world. Indeed, Crunchbase funding data, as well as interviews with startup investors and real estate industry professionals, show the San Francisco Bay Area has become the undisputed epicenter of artificial intelligence.

Last year, more than 50% of all global venture funding for AI-related startups went to companies headquartered in the Bay Area, Crunchbase data shows, as a cluster of talent congregates in the region.

Beginning in Q1 2023, when OpenAI's ChatGPT reached 100 million users within months of launching, the amount raised by Bay Area startups in AI started trending up. That accelerated with OpenAI raising $10 billion from Microsoft, the largest single funding deal ever for an AI foundation model company. In that quarter, more than 75% of AI funding went to San Francisco Bay Area startups.

AI-related companies based in the Bay Area went on to raise more than $27 billion in 2023, up from $14 billion in 2022, when the region's companies raised 29% of all AI funding.

From a deal count perspective, Bay Area companies raised 17% of global rounds in this sector in 2023, making the region the leading metro area in the U.S. That is up from 13% in 2022.

The figure also represents more than a third of AI deal counts in the U.S., and means the Bay Area alone had more AI-related startup funding deals than all countries outside of the U.S.

Leading Bay Area-based foundation model companies OpenAI, Anthropic and Inflection AI have each raised at least $1 billion, and in some cases far more, and have established major real estate footprints in San Francisco.

OpenAI has closed on 500,000 square feet of office space in the city's Mission Bay district, and Anthropic around 230,000 square feet in the Financial District.

"From a leasing standpoint, [AI] is the bright spot in San Francisco right now," said Derek Daniels, a regional director of research in San Francisco for commercial real estate brokerage Colliers, who has been following the trends closely.

By contrast, big tech has been pulling back and reassessing space needs, he said.

According to Daniels, the city's commercial real estate market bottomed out in the second half of 2023. While the San Francisco office market still faces challenges, there is quality sublet space available, which is seeing some demand from smaller teams, he said. And some larger tenants who have been out of the market for office space of 100,000 square feet or more are starting to come back.

Fifty percent of startups that graduated from the prestigious startup accelerator Y Combinator's April batch were AI-focused companies.

"Many of the founders who came to SF for the batch have decided to make SF home for themselves, and for their companies," Garry Tan, president and CEO of Y Combinator, said in an announcement of the accelerator's winter 2024 batch.

YC itself has expanded its office space in San Francisco's Dogpatch neighborhood, adjacent to Mission Bay. "We are turning San Francisco's doom loop into a boom loop," Tan added.

Of the batch of 34 companies that graduated in March from 500 Global, another accelerator, 60% are in AI. Its next batch is closer to 80% AI-focused, said Clayton Bryan, partner and head of the global accelerator fund.

Around half of the companies in the recently graduated 500 Global batch are from outside the U.S., including Budapest, London and Singapore. But many want to set up shop in the Bay Area for the density of talent and the know-how that circulates through hackathons, dinners and other events, he said.

Startup investors also see the Bay Area as the epicenter for AI.

"In the more recent crop of AI companies there is a real center of gravity in the Bay Area," said Andrew Ferguson, a partner at Databricks Ventures, which has been actively investing in AI startups such as Perplexity AI, Unstructured Technologies, Anomalo, Cleanlab and Glean.

"The Bay Area does not have a lock on good talent. But there's certainly a nucleus of very strong talent," he said.

Databricks Ventures, the venture arm of AI-enhanced data analytics unicorn Databricks, has made five investments in AI companies in the Bay Area in the past six months. In total, the firm has made around 25 portfolio company investments since the venture arm was founded in 2022, largely in the modern data stack.

Freed from in-person office requirements during the pandemic, many young tech workers decamped from the expensive Bay Area to travel or work remotely in less expensive locales. Now, some are moving back to join the San Francisco AI scene.

"Many young founders are just moving back to the Bay Area, even if they were away for the last couple of years, in order to be a part of immersing themselves in the middle of the scene," said Stephanie Zhan, a partner at Sequoia Capital. "It's great for networking, for hiring, for learning about what's going on, what other products people are building."

Coincidentally, Sequoia Capital subleased space to OpenAI in its early days, in an office above Dandelion Chocolates in San Francisco's Mission District.

Zhan presumes that many nascent AI companies aren't yet showing up in funding data, as they are still ideating or at pre-seed or seed funding, and will show up in future funding cycles.

While the Bay Area dominates for AI funding, it's important to note the obvious: much of that comes from a few massive deals to the large startups based in the region, including OpenAI, Anthropic and Inflection AI.

There is a lot of AI startup and research activity elsewhere as well, Zhan noted, with researchers coming out of universities around the globe, including École Polytechnique in Paris, ETH Zürich, and the University of Cambridge and Oxford University in the U.K., to name a few. Lead researchers from the University of Toronto and the University of Waterloo have also fed into generative AI technology in San Francisco and in Canada, Bryan said.

While the U.S. has a strong lead, countries that are leading funding totals for AI-related startups outside of the U.S. are China, the U.K., Germany, Canada and France, according to Crunchbase data.

London-based Stability AI kicked off the generative AI moment before ChatGPT with its text-to-image models in August 2022. Open source model developer Mistral AI, based in Paris, has raised large amounts led by Bay Area-based venture capital firms Lightspeed Venture Partners and Andreessen Horowitz.

And in China, foundation model company Moonshot AI, based in Beijing, has raised more than $1 billion.

"Still, the center of gravity in the Bay Area is driven by teams coming out of Big Tech or UC Berkeley and Stanford University who have a history of turning those ideas into startups," said Ferguson.

The unique congregation of Big Tech companies, research, talent and venture capital in the Bay Area has placed the region at the forefront of AI.

"The valuation of the AI companies and some of the revenue by the top end of the AI companies is driving that population migration," said 500 Global's Bryan. At a recent AI event at Hana House in Palo Alto, California, he found it interesting that most people were not originally from the Bay Area. "Everyone now wants a direct piece or an indirect piece of that value that is going into AI."

Illustration: Li-Anne Dias

See original here:

The SF Bay Area Has Become The Undisputed Leader In AI Tech And Funding Dollars - Crunchbase News

Recommendation and review posted by G. Smith

Android is getting an AI-powered scam call detection feature – The Verge

Posted: May 15, 2024 at 2:36 am

Google is working on new protections to help prevent Android users from falling victim to phone scams. During its I/O developer conference on Tuesday, Google announced that it's testing a new call monitoring feature that will warn users if the person they're talking to is likely attempting to scam them and encourage them to end such calls.

Google says the feature utilizes Gemini Nano, a reduced version of the company's Gemini large language model that can run locally and offline on Android devices, to look for fraudulent language and other conversation patterns typically associated with scams. Users will then receive real-time alerts during calls where these red flags are present.

Some examples of what could trigger these alerts include calls from bank representatives who make requests that real banks are unlikely to make, such as asking for personal information like your passwords or card PINs, requesting payments via gift cards, or asking users to urgently transfer money to them. These new protections are entirely on-device, so the conversations monitored by Gemini Nano will remain private, according to Google.
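Google hasn't published an API for this feature, so the following Python sketch is purely illustrative: every name in it is hypothetical, and the real feature would score a rolling transcript with an on-device model rather than a keyword list.

```python
# Hypothetical sketch of on-device scam screening. In the real feature
# a local model (Gemini Nano) would score conversation patterns; a
# simple red-flag phrase list stands in for that model here.
SCAM_RED_FLAGS = [
    "your password",
    "card pin",
    "pay with gift cards",
    "urgently transfer",
]

def scam_risk(transcript_window: str) -> float:
    """Toy stand-in for a local classifier: fraction of red-flag
    phrases present in the recent transcript, capped at 1.0."""
    text = transcript_window.lower()
    hits = sum(phrase in text for phrase in SCAM_RED_FLAGS)
    return min(1.0, hits / 2)

def on_transcript_chunk(chunk: str, buffer: list[str]) -> None:
    """Called as new speech is transcribed; everything stays on-device."""
    buffer.append(chunk)
    window = " ".join(buffer[-20:])  # rolling window of recent speech
    if scam_risk(window) >= 0.5:
        print("Warning: this call shows patterns common in scams.")
```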

There's no word on when the scam detection feature will be available, but Google says users will need to opt in to utilize it and that it'll share more information later this year.

So while the pool of users who might find such tech useful is vast, compatibility could limit its reach. Gemini Nano is currently supported only on the Google Pixel 8 Pro and the Samsung Galaxy S24 series, according to its developer support page.

Visit link:

Android is getting an AI-powered scam call detection feature - The Verge

Recommendation and review posted by G. Smith

GPT-4o delivers human-like AI interaction with text, audio, and vision integration – AI News

Posted: May 15, 2024 at 2:36 am

OpenAI has launched its new flagship model, GPT-4o, which seamlessly integrates text, audio, and visual inputs and outputs, promising to enhance the naturalness of machine interactions.

GPT-4o, where the "o" stands for "omni," is designed to cater to a broader spectrum of input and output modalities. "It accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs," OpenAI announced.

Users can expect a response time as quick as 232 milliseconds, mirroring human conversational speed, with an impressive average response time of 320 milliseconds.

The introduction of GPT-4o marks a leap from its predecessors by processing all inputs and outputs through a single neural network. This approach enables the model to retain critical information and context that were previously lost in the separate model pipeline used in earlier versions.

Prior to GPT-4o, Voice Mode could handle audio interactions with latencies of 2.8 seconds for GPT-3.5 and 5.4 seconds for GPT-4. The previous setup involved three distinct models: one for transcribing audio to text, another for textual responses, and a third for converting text back to audio. This segmentation led to loss of nuances such as tone, multiple speakers, and background noise.
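The latency win comes from collapsing that chain. The sketch below contrasts the two architectures; the function names are illustrative stand-ins, not OpenAI APIs, and the latency figures in the comments are the ones quoted above.

```python
# Illustrative only: these functions are hypothetical stand-ins, not
# OpenAI APIs. The point is the structural difference, not a client.

def transcribe(audio: bytes) -> str: ...   # model 1: audio -> text
def reply(text: str) -> str: ...           # model 2: text -> text
def synthesize(text: str) -> bytes: ...    # model 3: text -> audio

def old_voice_mode(audio: bytes) -> bytes:
    """Three models chained: stage latencies add up (2.8-5.4 s total),
    and tone, multiple speakers, and background noise are discarded
    once audio is flattened to text at the first step."""
    return synthesize(reply(transcribe(audio)))

def gpt_4o(audio: bytes) -> bytes:
    """One network consumes and emits audio tokens directly, so
    paralinguistic cues survive end to end and average response
    time drops to ~320 ms."""
    ...
```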

As an integrated solution, GPT-4o boasts notable improvements in vision and audio understanding. It can perform more complex tasks such as harmonising songs, providing real-time translations, and even generating outputs with expressive elements like laughter and singing. Examples of its broad capabilities include preparing for interviews, translating languages on the fly, and generating customer service responses.

Nathaniel Whittemore, Founder and CEO of Superintelligent, commented: "Product announcements are going to inherently be more divisive than technology announcements because it's harder to tell if a product is going to be truly different until you actually interact with it. And especially when it comes to a different mode of human-computer interaction, there is even more room for diverse beliefs about how useful it's going to be."

"That said, the fact that there wasn't a GPT-4.5 or GPT-5 announced is also distracting people from the technological advancement that this is a natively multimodal model. It's not a text model with a voice or image addition; it is a multimodal token in, multimodal token out. This opens up a huge array of use cases that are going to take some time to filter into the consciousness."

GPT-4o matches GPT-4 Turbo performance levels on English text and coding tasks but significantly outperforms it in non-English languages, making it a more inclusive and versatile model. It sets a new benchmark in reasoning with a high score of 88.7% on 0-shot CoT MMLU (general knowledge questions) and 87.2% on the 5-shot no-CoT MMLU.

The model also excels in audio and translation benchmarks, surpassing previous state-of-the-art models like Whisper-v3. In multilingual and vision evaluations, it demonstrates superior performance, enhancing OpenAI's multilingual, audio, and vision capabilities.

OpenAI has built robust safety measures into GPT-4o by design, using techniques to filter training data and refining behaviour through post-training safeguards. The model has been assessed through a Preparedness Framework and complies with OpenAI's voluntary commitments. Evaluations in areas like cybersecurity, persuasion, and model autonomy indicate that GPT-4o does not exceed a Medium risk level in any category.

Further safety assessments involved extensive external red teaming with over 70 experts in various domains, including social psychology, bias, fairness, and misinformation. This comprehensive scrutiny aims to mitigate risks introduced by the new modalities of GPT-4o.

Starting today, GPT-4o's text and image capabilities are available in ChatGPT, including a free tier and extended features for Plus users. A new Voice Mode powered by GPT-4o will enter alpha testing within ChatGPT Plus in the coming weeks.

Developers can access GPT-4o through the API for text and vision tasks, benefiting from its doubled speed, halved price, and enhanced rate limits compared to GPT-4 Turbo.
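As a minimal sketch, assuming the standard Chat Completions interface in the openai Python package (the image URL is a placeholder), a combined text-and-vision request looks like this:

```python
# Minimal sketch of a GPT-4o text + vision request via the Chat
# Completions API. Assumes `pip install openai` and an OPENAI_API_KEY
# environment variable; the image URL is a placeholder.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is in this image."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```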

OpenAI plans to expand GPT-4o's audio and video functionalities to a select group of trusted partners via the API, with a broader rollout expected in the near future. This phased release strategy aims to ensure thorough safety and usability testing before the full range of capabilities is made publicly available.

"It's hugely significant that they've made this model available for free to everyone, as well as making the API 50% cheaper. That is a massive increase in accessibility," explained Whittemore.

OpenAI invites community feedback to continuously refine GPT-4o, emphasising the importance of user input in identifying and closing gaps where GPT-4 Turbo might still outperform it.

(Image Credit: OpenAI)

See also: OpenAI takes steps to boost AI-generated content transparency


Original post:

GPT-4o delivers human-like AI interaction with text, audio, and vision integration - AI News

Recommendation and review posted by G. Smith

