- [Is Gemini, Google's most powerful model ever, really "crushing" GPT-4 across the board?](https://readwise.io/reader/shared/01hh1q22gwb4515r7kprcb23pa)
- [Google says its Gemini AI outperforms both GPT-4 and expert humans](https://readwise.io/reader/shared/01hh16n9xyd0nv8re8jwjqzj5e)
- [Google says Gemini ushers in the era of "native multimodality," but the demo video is accused of exaggerating its performance](https://readwise.io/reader/shared/01hh3r2f9xkepfvwdfsabc8btr)
- [Google launches Gemini, its most advanced AI model, hoping to beat GPT-4](https://readwise.io/reader/shared/01hh18n38b2adr7gsce8p1x8vs)

Q: Summarize the key points about Google Gemini. Explain your thinking and findings in your own words (Feynman technique).

- Multimodal: Gemini was built as a multimodal model from the start. It can understand, operate on, and combine different types of information, including text, code, audio, images, and video.
- Gemini 1.0 comes in three sizes:
	- Medium cup: Gemini Nano — the most efficient model for on-device tasks
	- Large cup: Gemini Pro — the best model for scaling across a broad range of tasks
	- Extra-large cup: Gemini Ultra — the largest and most capable model for highly complex tasks
- According to Google's official numbers, Gemini Ultra leads on 30 of 32 LLM benchmarks, surpassing GPT-4 in text, general reasoning, math, code, and other areas.
- On MMLU (Massive Multitask Language Understanding, which tests an AI model's **knowledge** and **problem-solving** ability), Gemini Ultra scores 90% accuracy versus 86.4% for GPT-4.
- Much like Duolingo's model, pairing Gemini with the phone — a device everyone already owns — lets the public reach the new technology quickly. Gemini Nano already runs on the Pixel 8 Pro.
- Parameter counts: GPT-4 reportedly has 1.76 trillion parameters, GPT-3 has 175 billion, LaMDA has 137 billion, Gemini Pro reportedly has 540 billion, and Gemini Ultra ???
- By the five-level classification of [[AGI]], the Gemini demo video shows a model that can see, speak, reason, and perceive and express simple emotions and values, suggesting the potential to reach AGI-1.

# What does this mean for me

1. OpenAI and Google are like Microsoft and Apple back in the day: technology advances through their rivalry. Seen this way, the competition appears to be a positive force.
2. AI learning increasingly resembles human learning: the material can include text, video, audio, and code, and the method is self-directed learning toward a given goal.
3. The MMLU test for AI models also offers a mirror for judging whether a human is an expert: the knowledge a person holds plus their ability to solve problems.
4. Who knows what impact Google's phones and Microsoft's search engine and Office software will have on the long-dominant iPhone and Google Chrome 😂, but all of this shows that AGI will no longer be a distant new technology — it will be woven into every corner of everyday life.

# References

GPT-4, the latest iteration of the Generative Pre-trained Transformer series developed by OpenAI, represents a significant advancement in scale and capabilities. While OpenAI has not officially disclosed the precise size of the model, credible sources and rumors suggest that GPT-4 has approximately 1.76 trillion parameters ([Wikipedia](https://en.wikipedia.org/wiki/GPT-4), [The Decoder](https://the-decoder.com/gpt-4-architecture-datasets-costs-and-more-leaked/)).
This is a substantial increase over its predecessors: GPT-3 had 175 billion parameters, and GPT-3.5 further refined that model ([AMBCrypto](https://ambcrypto.com/blog/the-ultimate-guide-to-gpt-4-parameters-a-complete-overview/)).

GPT-4's architecture is reportedly based on a "Mixture of Experts" (MoE) design. This type of ensemble learning combines different models, referred to as "experts," for decision-making. In an MoE model, a gating network determines the weight of each expert's output based on the input, allowing different experts to specialize in different parts of the input space. This approach is particularly effective for handling large and complex datasets ([The Decoder](https://the-decoder.com/gpt-4-architecture-datasets-costs-and-more-leaked/)).

The increased number of parameters gives GPT-4 enhanced capabilities compared to earlier versions. For instance, GPT-4 shows significant improvements in understanding and generating language, processing and interpreting images, and retaining context over longer conversations. It also outperforms GPT-3.5 on various professional and academic benchmarks, such as simulated bar exams and the SAT ([GPT-4 technical report](https://ar5iv.org/abs/2303.08774)).

These advancements highlight the continuous evolution and scaling of AI language models, offering more sophisticated and nuanced capabilities with each iteration.
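To make the gating idea concrete, here is a minimal NumPy sketch of top-k MoE routing. This is purely illustrative — GPT-4's internals are unpublished, and the expert/gate shapes and top-k choice here are assumptions; in a real MoE transformer each expert would be a full feed-forward sub-network, not a single matrix.

```python
# Illustrative sketch of Mixture-of-Experts routing (NOT GPT-4's actual
# implementation, which has never been disclosed).
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, D_IN, D_OUT, TOP_K = 4, 8, 8, 2  # assumed toy dimensions

# Each "expert" here is just a linear map; the gating network is another
# linear map from the input to one score per expert.
experts = [rng.normal(size=(D_IN, D_OUT)) for _ in range(N_EXPERTS)]
gate_w = rng.normal(size=(D_IN, N_EXPERTS))

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x):
    """Route input x to the top-k experts chosen by the gating network."""
    scores = softmax(x @ gate_w)               # gating probability per expert
    top = np.argsort(scores)[-TOP_K:]          # indices of the top-k experts
    weights = scores[top] / scores[top].sum()  # renormalize over chosen experts
    # Output is the gate-weighted sum of only the selected experts' outputs,
    # so most experts do no work for this input (sparse activation).
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.normal(size=D_IN)
y = moe_forward(x)
print(y.shape)  # (8,)
```

The key property this sketch shows is sparsity: although the model holds all four experts' parameters, only two are evaluated per input, which is how MoE architectures keep inference cost far below their total parameter count.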