Multimodal Model: An AI that can read, see, and listen

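Artificial intelligence (AI) has become part of our daily lives, handling everything from trivial tasks, such as finding the best restaurant for a date, to providing personal assistance at work. Although AI has not yet reached the level depicted in science fiction films such as 2001: A Space Odyssey and Minority Report, the gap between fiction and reality has narrowed significantly. On November 30, 2022, OpenAI's launch of ChatGPT [1] marked a major leap and imprinted the term "large language model (LLM)" on everyone's mind. LLMs are a class of AI that shows an unprecedented ability to understand human writing and respond accordingly. Since then, a wave of new AIs has emerged, such as Google's Gemini [2] [3] [4], Meta's LLaMA [5] [6], and Mistral AI's Mixtral [7], along with applications and use cases that build on this ability to understand text. LLMs make a wide range of applications possible. Customers hold pleasant conversations with LLM-powered customer service chatbots. Programmers use LLMs to assist with coding. Entrepreneurs and executives use LLMs to summarize world events and make informed decisions. These diverse applications stem from one fundamental ability: the ability to "read" text. This ability, once found only in fiction, is now changing our lives.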

Writing and reading are essential for expressing and understanding ideas, intentions, and instructions, but words are not always the most effective way to convey information. Visual imagery is another powerful means of communication, one that arguably predates spoken language.

Figure 1: Prehistoric art from Cueva de las Manos, Argentina. While it is unclear whether the intention behind this painting was communication, it shows that depiction has been part of human history since the dawn of humanity. Image courtesy of Mariano, CC BY-SA 3.0, via Wikimedia Commons. URL: https://en.wikipedia.org/wiki/Cave_painting

AI capable of understanding images has existed for quite some time. For example, AI can locate and highlight objects in an image (instance segmentation) and generate a short description of an image (image captioning). However, these AIs are usually limited to their designated tasks. Building on these achievements, the next logical step is to develop an AI that can perform multiple tasks, including some reasoning, much as an LLM does.

Figure 2: Instance segmentation is an AI technique that identifies the boundaries of objects in an image and highlights them, as demonstrated by the Segment Anything Model (SAM) by Meta [8]. Image courtesy of The Segment Anything Team, Meta. URL: https://ai.meta.com/blog/segment-anything-foundation-model-image-segmentation/

Figure 3: Image captioning is an AI technique that generates text descriptions for images. For example, OpenAI’s CLIP model [9] might generate the following description from the image: ‘A zoomed-in photo of a red and white triangle with exclamation mark warning traffic sign’. Image courtesy of S. Houben and J. Stalkamp, GTSRB dataset. URL: https://www.kaggle.com/datasets/harbhajansingh21/german-traffic-sign-dataset

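Multimodal models represent a major leap in AI development because they can accept input in multiple formats, such as text, images, and audio, and perform complex tasks that require understanding input from different modalities. Multimodality enables AI to handle tasks that were previously impossible, such as summarizing business insights from complex charts, searching presentations for specific information, and extracting information from images. Previously, these tasks required separate, specialized AIs. For example, one AI might specialize in interpreting business charts, providing insights based only on an input chart. Another might specialize in data extraction, only converting images into numerical tables. In contrast, multimodality offers a unified application that can interpret and process information from different sources, including performing specific tasks based on user instructions.

Figure 4: Do you remember those days when you would spend 15 minutes finding the right remote control? Then you would spend another half hour choosing a film to watch with only an hour of free time. Wouldn’t it be irritating to have to search through your collection of AIs just to find the right one to assist you with your work? Image courtesy of J. Bakker Ijmuiden, CC BY 2.0, via Wikimedia Commons. URL: https://commons.wikimedia.org/wiki/File:Remote_control_(51887536992).jpg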

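Multimodal models are trained on large amounts of data, including text, images, and audio. During this process, the model learns to recognize semantic relationships between inputs and outputs in different formats. For example, a model can associate an audio clip of the word "hello" with its text representation, or recognize an image, an audio clip, and a text description of a thunderstorm as related aspects of the same event. The ability to understand the connections between different input and output formats enables the model to interpret the user's input and respond accordingly.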
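The article does not say how these cross-format associations are represented. One common approach, used by CLIP-style models [9], is to map every input into a shared vector space in which related items sit close together. The sketch below is a toy illustration of that idea only: the three-dimensional vectors, the item names, and the `closest_match` helper are all made up for this example, and a real model would learn its embeddings with trained encoders.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings keyed by (modality, item). These numbers are
# invented for illustration; they do not come from any real model.
embeddings = {
    ("audio", "hello clip"): [0.9, 0.1, 0.0],
    ("text", "hello"): [0.8, 0.2, 0.1],
    ("image", "thunderstorm photo"): [0.0, 0.2, 0.9],
    ("text", "thunderstorm description"): [0.1, 0.1, 0.95],
}

def closest_match(query_key):
    """Return the most similar item that belongs to a different modality."""
    query_vec = embeddings[query_key]
    candidates = [key for key in embeddings if key[0] != query_key[0]]
    return max(
        candidates,
        key=lambda key: cosine_similarity(query_vec, embeddings[key]),
    )

print(closest_match(("audio", "hello clip")))
print(closest_match(("image", "thunderstorm photo")))
```

With these made-up vectors, the audio clip of "hello" is matched to the text "hello", and the thunderstorm photo to the thunderstorm description, which mirrors the pairing described above.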

By narrowing the gap between AI and human-like understanding, multimodal models open up new possibilities for using technology. These models enable machines to process information and respond to our needs in a more comprehensive and sophisticated way.

Example use case 1: Business insights

In the early 2000s, the rise of big data transformed business decision-making, underscoring the value of data and the need for tools to access it. Data enables businesses to make strategic decisions based on facts, metrics, and insights, a process known as data-driven decision-making. However, accessing this data requires significant technical expertise to translate everyday language into complex machine-language queries. Once results are obtained, technical interpretation is often still needed to turn them into business insights.

Figure 5: This SQL command for PostgreSQL calculates how much, on average, each type of member spent in May 2024. It uses technical terms that are not obvious to most audiences, except technical geeks. Image courtesy of the author.
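The query in Figure 5 is only available as an image, so the sketch below is not the author's SQL. It is a minimal stand-in that shows the kind of query the caption describes, run against SQLite (via Python's built-in `sqlite3` module) rather than PostgreSQL so that it is self-contained. The `members` and `purchases` tables, their columns, and every value in them are assumptions made for this example.

```python
import sqlite3

# In-memory database with a hypothetical schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (member_id INTEGER PRIMARY KEY, member_type TEXT);
    CREATE TABLE purchases (member_id INTEGER, purchased_on TEXT, amount REAL);
    INSERT INTO members VALUES (1, 'gold'), (2, 'gold'), (3, 'basic');
    INSERT INTO purchases VALUES
        (1, '2024-05-03', 100.0),
        (2, '2024-05-20', 300.0),
        (3, '2024-05-11', 50.0),
        (3, '2024-06-01', 999.0);  -- outside May 2024, so it is filtered out
""")

# Average purchase amount per member type, restricted to May 2024.
# Dates are stored as ISO-8601 text, so string comparison orders them correctly.
QUERY = """
    SELECT m.member_type, AVG(p.amount) AS avg_spend
    FROM purchases AS p
    JOIN members AS m ON m.member_id = p.member_id
    WHERE p.purchased_on >= '2024-05-01' AND p.purchased_on < '2024-06-01'
    GROUP BY m.member_type
    ORDER BY m.member_type
"""

rows = conn.execute(QUERY).fetchall()
for member_type, avg_spend in rows:
    print(member_type, avg_spend)
```

Even in this small form, the query needs a join, a date filter, and a grouped aggregate, which illustrates why a plain-language business question can require technical expertise to answer.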

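Business leaders face challenges every day and must make many decisions. Getting the right data to drive those decisions is an important step that can take a lot of time, and decisions often cannot wait. This is where multimodal models come in. By understanding human language and translating it into machine language for data processing, these models make the information in a data warehouse accessible through natural language. In addition, the retrieved data can be complex and hard to understand. Frankly, most people cannot make sense of raw data. Multimodal models can also present the data in simpler formats, such as charts or graphs, along with summaries that highlight key information and guide users toward the right decision.

Figure 6: (Left) This example of a business report shows the monthly sales performance of three fictitious SKUs. Image courtesy of the author. (Right) The response from Gemini-1.5-flash-001 when asked the question “Which month has the highest sales for each SKU?”. The response was obtained from Multimodal Prompt Design on Google Cloud Console: https://console.cloud.google.com/vertex-ai/generative/multimodal/gallery. Image courtesy of Google Cloud Console.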

Imagine an executive trying to understand last year's sales performance for three SKUs. A multimodal approach lets the executive simply ask, "Which month had the highest sales for SKU A, B, and C?" The model then presents the result: SKU A performed best in April, SKU B in October, and SKU C also in April. This provides a quick, simple, and easy-to-understand response.
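To make the question concrete, the sketch below shows the computation that sits behind such an answer once the sales figures are available in structured form. The numbers are invented for illustration and are not the data shown in Figure 6; they were chosen only so that the result agrees with the answer quoted above. The function name `best_month_per_sku` is likewise hypothetical.

```python
# Made-up monthly sales (units) for three fictitious SKUs.
monthly_sales = {
    "SKU A": {"Jan": 120, "Apr": 310, "Jul": 200, "Oct": 180},
    "SKU B": {"Jan": 90, "Apr": 150, "Jul": 170, "Oct": 260},
    "SKU C": {"Jan": 60, "Apr": 220, "Jul": 140, "Oct": 130},
}

def best_month_per_sku(sales):
    """Map each SKU to the month in which its sales were highest."""
    return {
        sku: max(by_month, key=by_month.get)
        for sku, by_month in sales.items()
    }

best = best_month_per_sku(monthly_sales)
for sku, month in best.items():
    print(f"{sku}: {month}")
```

The value of the multimodal approach is that the executive never has to write this step or the query that feeds it; the model reads the chart or report and the plain-language question directly.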

Example use case 2: Human-like responses to images

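Advanced language models such as GPT-4 have demonstrated the ability to make decisions and respond conversationally like humans [10] [11] [12]. Leveraging this ability, language models have been used as assistants for a variety of tasks, such as drafting formal articles, condensing long, tedious passages into short, concise sentences, and analyzing sentiment and intent in text. Assistants powered by language models excel at understanding user instructions and generating responses that resemble human communication.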

Multimodality takes the capabilities of LLMs a step further, enabling them to emulate human responses by understanding both text and visual cues. Assistants can now analyze images and give feedback based on the user's instructions. Much like drafting an article, users can ask for a draft of a web page or an advertising banner. LLMs can process an image and generate a summary describing it, such as a short paragraph, including information extracted from the image. They can also classify images based on the visual cues they understand.

Figure 7: (Top Left) An input image of shawarma, one of the author’s already-too-many favorite foods. Image courtesy of the author. (Top Right) A response from ChatGPT-4o when asked the question “Is this a good presentation of food to put on a promotion pamphlet?”. The response was obtained from ChatGPT playground: https://platform.openai.com/playground/. Image courtesy of OpenAI platform. (Bottom) A response from Gemini-1.5-flash-001 when asked the same question, obtained from Multimodal Prompt Design on Google Cloud Console: https://console.cloud.google.com/vertex-ai/generative/multimodal/gallery. Image courtesy of Google Cloud Console.

Although these capabilities may not seem like a breakthrough in the field of computer vision AI, each of these tasks, among others, used to require a specialized AI. Multimodality allows one sophisticated AI to handle a variety of tasks based on the user's instructions, effectively replacing a team of assistants with a single agent that helps with many everyday activities.

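Multimodality extends beyond text and images; audio and video are rapidly becoming the next focus of multimodal model development. Researchers and developers from academia to the private sector are continually developing and improving these AIs to enhance their capabilities. For example, Nvidia's Instant NeRF can generate realistic video sequences from a few images, creating projections of 3D objects [13]. Similarly, Google Research's VideoPoet can create short videos from a single image [14]. These advances herald a new wave of AI, bringing exciting possibilities and imaginative applications.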

References

[1] OpenAI, Introducing ChatGPT (2022) URL: https://openai.com/index/chatgpt/ Retrieved: May 24th, 2024

[2] S. Pichai and D. Hassabis, Introducing Gemini: Our Largest and Most Capable AI Model (2023) URL: https://blog.google/technology/ai/google-gemini-ai/ Retrieved: May 24th, 2024

[3] Gemini Team, Gemini: A Family of Highly Capable Multimodal Models (2024) arXiv: 2312.11805

[4] Gemini Team, Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (2024) arXiv: 2403.05530

[5] H. Touvron et al., LLaMA: Open and Efficient Foundation Language Models (2023) arXiv: 2302.13971

[6] Meta, Introducing Meta Llama 3: The Most Capable Openly Available LLM to Date (2024) URL: https://ai.meta.com/blog/meta-llama-3/ Retrieved: May 24th, 2024

[7] A. Q. Jiang et al., Mixtral of Experts (2024) arXiv: 2401.04088

[8] A. Kirillov et al., Segment Anything (2023) arXiv: 2304.02643

[9] A. Radford et al., Learning Transferable Visual Models From Natural Language Supervision (2021) arXiv: 2103.00020

[10] J. S. Park et al., Generative Agents: Interactive Simulacra of Human Behavior (2023) arXiv: 2304.03442

[11] G. Simmons and C. Hare, Large Language Models as Subpopulation Representative Models: A Review (2023) arXiv: 2310.17888

[12] C. R. Jones and B. K. Bergen, People cannot distinguish GPT-4 from a human in a Turing test (2024) arXiv: 2405.08007

[13] J. Stephens, Getting Started with NVIDIA Instant NeRFs URL: https://developer.nvidia.com/blog/getting-started-with-nvidia-instant-nerfs/

[14] D. Kondratyuk et al., VideoPoet: A Large Language Model for Zero-Shot Video Generation URL: https://sites.research.google/videopoet/

Originally published at https://www.sertiscorp.com/sertis-ai-research

2024-06-28 04:18:26 Translated from the original article by AI中文站