Topic Modeling Made Easy with Large Language Models

Topic modeling has long been a complex, time-consuming task that demands considerable expertise. The rise of large language models (LLMs), however, has opened the door to simplifying and automating the process, making it accessible to a much wider audience. In this article, we explore a simple yet effective approach to topic modeling with LLMs.

In my previous article, I discussed how topic modeling remains inaccessible to most people and proposed a radically different methodology using an Excel add-in. However, I was still uncertain about the suitability of this novel approach. With the June 21st release of the add-in supporting text embeddings, I can now explain the technique more objectively.

Getting the Right Dataset

The first step is to obtain a suitable text dataset. While there are many options, we chose the HuffPost News Category Dataset for this demonstration. It contains articles from 2012 to 2022, including the publication date, category, authors, headline, and article text, all neatly packaged in 240,000 JSON entries.

Thanks to Rishabh Misra for compiling this useful dataset and releasing it under a CC BY 4.0 license.

Because the full dataset is too large for Excel to handle comfortably and the paid API is costly, we focused on the thousand most recent articles, published between February and September 2022. JSON is easy to parse, so a quick C# script extracted the required fields and converted them to CSV.

While tourists in Tokyo are enjoying $7 lunches thanks to the weak yen, these dollar-denominated prices remain steep for me.
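
The C# script itself is not shown in the article. As a rough Python equivalent (a sketch under assumptions, not the author's code), the extraction could look like the following. It assumes the dataset stores one JSON object per line and that dates are ISO-formatted strings in a field named `date`; the `date` and `authors` field names are assumptions, while `headline`, `short_description`, and `category` are the names used elsewhere in this article.

```python
import csv
import io
import json

# Columns to keep; "date" and "authors" are assumed field names.
FIELDS = ["date", "category", "authors", "headline", "short_description"]


def latest_articles(json_lines, limit=1000):
    """Parse one JSON object per line and return the `limit` most recent records."""
    records = [json.loads(line) for line in json_lines if line.strip()]
    # ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
    records.sort(key=lambda record: record["date"], reverse=True)
    return records[:limit]


def write_csv(records, out):
    """Write the selected fields as CSV, silently dropping any other keys."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


if __name__ == "__main__":
    # Two made-up records stand in for the real dataset file.
    sample = [
        '{"headline": "Older story", "short_description": "x", '
        '"category": "POLITICS", "authors": "A", "date": "2022-02-01"}',
        '{"headline": "Newer story", "short_description": "y", '
        '"category": "SPORTS", "authors": "B", "date": "2022-09-23"}',
    ]
    buffer = io.StringIO()
    write_csv(latest_articles(sample, limit=1), buffer)
    print(buffer.getvalue())
```

With the real dataset, an open file handle can be passed to `latest_articles` in place of the sample list, since it only needs an iterable of lines.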

Exploring the Raw Topics

To see which topics are represented in the raw data, we can generate embeddings with this Excel formula:

=BB.EMBEDDING(CONCAT([@headline],CHAR(10),[@[short_description]]),1536)

This uses the default "text-embedding-3-small" model with its maximum output dimension of 1536. Visualizing the resulting vectors in the TensorFlow Embedding Projector shows that the dataset contains some topical clusters.
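
The article does not say how the vectors were moved from Excel into the projector. One plausible route, sketched here as an assumption, is to export them as two tab-separated files: one row of numbers per vector, and one label per row in a matching metadata file. The function name `write_projector_files` is hypothetical.

```python
import io


def write_projector_files(vectors, labels, vectors_out, metadata_out):
    """Write vectors and labels as parallel tab-separated rows."""
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per vector")
    for vector, label in zip(vectors, labels):
        vectors_out.write("\t".join(f"{x:.6f}" for x in vector) + "\n")
        # Collapse tabs and newlines in labels so rows stay aligned.
        # A single-column metadata file is written without a header row.
        metadata_out.write(" ".join(str(label).split()) + "\n")


if __name__ == "__main__":
    # Tiny made-up vectors; real ones would have 1536 components.
    demo_vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    demo_labels = ["First headline", "Second\theadline"]
    vec_buf, meta_buf = io.StringIO(), io.StringIO()
    write_projector_files(demo_vectors, demo_labels, vec_buf, meta_buf)
    print(vec_buf.getvalue())
    print(meta_buf.getvalue())
```

Using the headline as the label makes individual points identifiable when hovering over them in the projector.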

Validating the Provided Categories

The News Category Dataset comes with a "category" field that assigns each article to one of 42 predefined topics. As a quick sanity check, let's see whether GPT-3.5 Turbo agrees with these labels:

=BB.CATEGORIZE(CONCAT([@headline],CHAR(10),[@[short_description]]),Category[category])

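Impressively, GPT-3.5 matched the human-provided category for 569 of the thousand articles. The discrepancies likely stem from the subjective nature of the original labels, and from the fact that the AI labeled each article using only its headline and excerpt. Overall, this confirms that the categories are generally reasonable.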
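
For readers who want to reproduce this kind of check outside Excel, the comparison amounts to counting exact label matches. The sketch below assumes two parallel lists of labels and ignores case and surrounding whitespace; that normalization is an assumption, since the article does not say how matches were counted.

```python
def agreement_rate(human_labels, model_labels):
    """Return the fraction of positions where both label lists agree."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(
        human.strip().casefold() == model.strip().casefold()
        for human, model in zip(human_labels, model_labels)
    )
    return matches / len(human_labels)


if __name__ == "__main__":
    # Made-up labels for illustration only.
    human = ["POLITICS", "SPORTS", "WELLNESS", "TRAVEL"]
    model = ["politics", "SPORTS", "FOOD & DRINK", "Travel "]
    print(f"{agreement_rate(human, model):.1%}")
```

On the article's numbers, 569 matches out of 1,000 is an agreement rate of 56.9%. For scale, picking uniformly at random among 42 categories would match roughly 1 time in 42, or about 2.4%.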

Automating Topic Discovery

Now for the fun part: using AI to discover topics automatically. First, we prompt GPT to suggest a category for each article:

=BB.ASK(CONCAT("The following text is an article from a US news website. Choose an appropriate category for this article in one word.", CHAR(10), CHAR(10), "Text:", CHAR(10),[@headline], CHAR(10),[@[short_description]]))

This yielded 129 unique topics from the thousand articles. To consolidate them, we fed the topic list to GPT with the following prompt:

Group the topics in this file into categories suitable for a US news website.
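
Building the topic list for that prompt is a deduplication step. A minimal sketch follows, assuming the per-article suggestions are available as a list of strings; the normalization rules (trimming whitespace and trailing periods, unifying capitalization) are illustrative assumptions rather than the article's procedure.

```python
from collections import Counter


def unique_topics(suggestions):
    """Count distinct topics after light normalization of each suggestion."""
    counts = Counter()
    for raw in suggestions:
        # "politics." and " Politics " should count as the same topic.
        topic = raw.strip().rstrip(".").strip().title()
        if topic:
            counts[topic] += 1
    return counts


if __name__ == "__main__":
    # Made-up model outputs for illustration only.
    suggested = ["Politics", "politics.", " Sports ", "Entertainment", ""]
    counts = unique_topics(suggested)
    print(len(counts), "unique topics")
    # One topic per line, ready to save as the file referenced in the prompt.
    print("\n".join(sorted(counts)))
```

Keeping the counts alongside the names also shows which suggested topics are common and which appear only once.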

The model successfully consolidated the 129 topics into just 8 high-level categories. Finally, we apply these categories to the dataset:

=BB.CATEGORIZE(CONCAT([@headline],CHAR(10),[@[short_description]]),Topic[topic])

Visualizing the result in the Embedding Projector shows clear separation between the categories, with similar articles clustered tightly together. The subjectivity inherent in human labeling is reduced, and the proximity of articles within a category mirrors the embedding results.
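
The separation is judged visually in the article. One way to put a number on it, offered here as a sketch and not something the article did, is to compare the average cosine similarity of article pairs within the same category against pairs from different categories.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def separation(vectors, labels):
    """Return (mean same-category similarity, mean cross-category similarity)."""
    same, cross = [], []
    for i, j in combinations(range(len(vectors)), 2):
        bucket = same if labels[i] == labels[j] else cross
        bucket.append(cosine(vectors[i], vectors[j]))

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(same), mean(cross)


if __name__ == "__main__":
    # Made-up 2-D vectors standing in for 1536-dimensional embeddings.
    demo_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    demo_labels = ["Politics", "Politics", "Sports", "Sports"]
    within, between = separation(demo_vectors, demo_labels)
    print(f"within: {within:.3f}  between: {between:.3f}")
```

A within-category average well above the cross-category average would support what the projector shows. The loop visits every pair, so for a thousand 1536-dimensional vectors a vectorized library such as NumPy would be more practical than pure Python.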

Conclusion

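Topic modeling used to be an arcane art that required specialized expertise. With the advent of generative AI, even a middle schooler can do useful topic modeling, as long as they know the right prompts. Traditional topic modeling techniques are not yet obsolete, but the writing is on the wall. Like many data science tasks, topic modeling is becoming more accessible and more automated thanks to the power of large language models. The future looks bright for democratizing this valuable text analysis capability.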

2024-06-23 04:23:55 Translated from the original article by AI中文站