AI视野今日CS.NLP 自然语言处理论文速览 Fri, 29 Sep 2023 Totally 45 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
MindShift: Leveraging Large Language Models for Mental-States-Based Problematic Smartphone Use Interve…
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 17 Jan 2024 (showing first 100 of 163 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Deductive Closure Training of Language Models for Coherence, Accur…
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 10 Jan 2024 Totally 38 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Model Editing Can Hurt General Abilities of Large Language Models Authors Jia Chen Gu, Hao Xiang Xu, J…
Figure 1: Search volumes for “large language models”
近几个月来,大型语言模型(LLM)引起了很大的轰动(见图1)。这种需求导致了利用语言模型的网站和解决方案的不断开发。ChatGPT在2023年1月创下了用户群增长最快…
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 26 Sep 2023 Totally 75 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction Authors Zeyuan Allen Zhu, Yuanz…
大语言模型的学习材料
Other Papers
If you’re interested in the field of LLM, you may find the above list of milestone papers helpful to explore its history and state-of-the-art. However, each direction of LLM offers a unique set of insights and contribut…
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 31 Oct 2023 (showing first 100 of 141 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
The Eval4NLP 2023 Shared Task on Prompting Large Language Models a…
AI视野今日CS.NLP 自然语言处理论文速览 Mon, 1 Jan 2024 Totally 42 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Principled Gradient-based Markov Chain Monte Carlo for Text Generation Authors Li Du, Afra Amini, Lucas…
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 3 Oct 2023 (showing first 100 of 110 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Its MBR All the Way Down: Modern Generation Techniques Through the …
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 18 Jan 2024 Totally 35 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics …
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 27 Sep 2023 Totally 50 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models Authors Mert …
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 24 Oct 2023 (showing first 100 of 207 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
LINC: A Neurosymbolic Approach for Logical Reasoning by Combining …
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 17 Oct 2023 (showing first 100 of 135 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Step-by-Step Remediation of Students Mathematical Mistakes Authors…
LLMs:《Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca》翻译与解读 目录
相关文章
LLMs:《Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca》翻译与解读
LLMs:在单机CPUWindows系统上实现中文…
我是一个特殊的机器人助手,名字叫做LLM(Large Language Model)。想象一下,你知道电脑是怎么帮助人们做各种事情的吧?LLM就是一种非常聪明的电脑程序,它被训练得非常聪明,可以回答各种各样的问题…
文章目录 一、前言二、主要内容三、总结🍉 CSDN 叶庭云:https://yetingyun.blog.csdn.net/ 一、前言 ICLR 2024 高分论文:《Step-Back Prompting Enables Reasoning Via Abstraction in Large Language Models》
论文地址:https://openreview.net/forum?id=3bq3jsvcQ1
…
前言
Softmax函数 SoftMax ( x ) i e x i e x 1 ∑ j 2 N e x j , x 1 ≫ x j , j ∈ 2 , … , N \text{SoftMax}(x)_i \frac{e^{x_i}}{e^{x_1} \sum_{j2}^{N} e^{x_j}}, \quad x_1 \gg x_j, j \in 2, \dots, N SoftMax(x)iex1∑j2Nexjexi,x1≫xj,j∈2,……
AI视野今日CS.NLP 自然语言处理论文速览 Fri, 6 Oct 2023 Totally 44 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning Authors Ke Wang, Houxi…
AI视野今日CS.NLP 自然语言处理论文速览 Mon, 16 Oct 2023 Totally 53 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming Au…
AI视野今日CS.NLP 自然语言处理论文速览 Mon, 8 Jan 2024 Totally 17 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Authors DeepSeek AI Xiao Bi, Deli Ch…
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 10 Oct 2023 (showing first 100 of 172 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Few-Shot Spoken Language Understanding via Joint Speech-Text Model…
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 3 Jan 2024 Totally 24 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction Authors Zaratiana Ur…
现在已经是12月了,距离2024年只有一个月了,本文总结了11月的一些比较不错的大语言模型相关论文
System 2 Attention (is something you might need too).
https://arxiv.org/abs/2311.11829
一种称为S2A的新注意力方法被开发出来,解决llm…
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 5 Oct 2023 Totally 50 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Retrieval meets Long Context Large Language Models Authors Peng Xu, Wei Ping, Xianchao Wu, Lawrence McA…
论文标题: Making Large Language Models Perform Better in Knowledge Graph Completion 论文链接: https://arxiv.org/abs/2310.06671 代码链接:GitHub - zjukg/KoPA: [Paper][Preprint 2023] Making Large Language Models Perform Be…
从open AI 的论文可以看到,大语言模型的优化,分下面三个步骤,SFT,RM,PPO,我们跟随大神的步伐,来学习一下这三个步骤和代码实现,本章介绍PPO实践。
生活中,我们经常会遇到…
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 4 Jan 2024 Totally 29 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Multilingual Instruction Tuning With Just a Pinch of Multilinguality Authors Uri Shaham, Jonathan Herzi…
AI视野今日CS.NLP 自然语言处理论文速览 Tue, 9 Jan 2024 Totally 80 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Infe…
摘要 以往的LLM(Large Languages Models)研究都遵从一个假设,即更多的参数将导致更好的性能。但也发现,给定计算预算限制后,最佳性能的模型不是参数最大的,而是数据更多的。对于实际场景,首选的…
文章目录 一、前言二、主要内容1. 引言2. 方法3. 结果三、总结🍉 CSDN 叶庭云:https://yetingyun.blog.csdn.net/ 一、前言 Foundation Models for Decision Making Workshop at NeurIPS 2023:Using Large Language Models for Hyperparameter Optimization
Insight:我们…
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 25 Oct 2023 (showing first 100 of 112 entries) Totally 100 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft R…
AI视野今日CS.NLP 自然语言处理论文速览 Fri, 27 Oct 2023 Totally 80 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case …
DeepMind的大模型Gopher《Scaling Language Models: Methods, Analysis & Insights from Training Gopher》论文:https://arxiv.org/pdf/2112.11446.pdf 相关博客 【自然语言处理】【大模型】DeepMind的大模型Gopher 【自然语言处理】【大模型】Chinchilla&…
Editing Large Language Models: Problems, Methods, and Opportunities
论文链接 代码链接
摘要
由于大语言模型(LLM)中可能存在一些过时的、不适当的和错误的信息,所以有必要纠正模型中的相关信息。如何高效地修改模型中的相关信息而不影…
llm对文本指令非常有用,但是如果我们尝试向模型提供某种文本格式的表格数据和该表格上的问题,LLM更有可能产生不准确的响应。
在这篇文章中,我们将介绍微软发表的一篇研究论文,“Table-GPT: Table- tuning GPT for Diverse Table…
AI视野今日CS.NLP 自然语言处理论文速览 Fri, 1 Mar 2024 Totally 67 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling Authors G…
文章目录~ 1.Gecko: Versatile Text Embeddings Distilled from Large Language Models2.Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference3.LUQ: Long-text Uncertainty Quantification for LLMs4.Draw-and-Understand: Leveraging Visua…
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 28 Sep 2023 Totally 38 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing Authors Brian Yan,…
一、背景由来
过去几年里,以ChatGPT为代表的基于prompt范式的大型语言模型 (Large Language Model,LLM) 取得了巨大的成功。然而,对生成结果的评估是主观和依赖上下文的,这些结果难以用现有的基于规则的文本生成指标 (如 BLUE 和…
ChatGPT已经成为家喻户晓的名字,而大语言模型在ChatGPT刺激下也得到了快速发展,这使得我们可以基于这些技术来改进我们的业务。
但是大语言模型像所有机器/深度学习模型一样,从数据中学习。因此也会有garbage in garbage out的规则。也就是说…
这里是MIT 6.5940 Fall 2023的第一个实验Lab1的一些笔记,课程传送门:Han Lab
Setup
First, install the required packages and download the datasets and pretrained model. Here we use CIFAR10 dataset and VGG network which is the same as what…
欢迎关注我的CSDN:https://spike.blog.csdn.net/ 本文地址:https://blog.csdn.net/caroline_wendy/article/details/136617643 大语言模型(LLM, Large Language Model)的发展和应用是一个非常广泛的领域,涉及从早期的统计模型到现代基于深度学…
论文目录~ 1.Debiasing Large Visual Language Models2.Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering3.Towards a Psychology of Machines: Large Language Models Predict Human Memory4.Can we obtain significant succ…
背景 我在本地windoes部署ChatGLM3-bB,且希望部署后能提供HTTP server的能力。 模型部署且启动是成功了,但是在访问生成接口/v1/chat/completions时报错failed to open nvrtc-builtins64_121.dll。 问题详细描述 找不到nvrtc-builtins64_121.dll Runtime…
AI视野今日CS.NLP 自然语言处理论文速览 Wed, 11 Oct 2023 Totally 81 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression Author…
AI视野今日CS.NLP 自然语言处理论文速览 Fri, 12 Jan 2024 Totally 60 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings Authors Hiroaki Yamagi…
GLM-130B:一个开源双语预训练语言模型《GLM-130B: An open bilingual pre-trained model》论文:https://arxiv.org/pdf/2210.02414.pdf 相关博客 【自然语言处理】【大模型】GLM-130B:一个开源双语预训练语言模型 【自然语言处理】【大模型】…
AI视野今日CS.NLP 自然语言处理论文速览 Mon, 15 Jan 2024 Totally 57 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Machine Translation Models are Zero-Shot Detectors of Translation Direction Authors Michelle Wastl, Ja…
AI视野今日CS.NLP 自然语言处理论文速览 Mon, 30 Oct 2023 Totally 67 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
An Approach to Automatically generating Riddles aiding Concept Attainment Authors Niharika Sri Parasa,…
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement 前言ABSTRACT1 INTRODUCTION2 THE OS-COPILOT FRAMEWORK2.1 PLANNER2.2 CONFIGURATOR2.2.1 DECLARATIVE MEMORY2.2.2 PROCEDURAL MEMORY2.2.3 WORKING MEMORY 2.3 ACTOR 3 THE FRIDAY AGENT3.1 A RUNNIN…
文章目录~ 1.Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey2.VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding3.MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Langu…
文章目录~ 1.AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent2.Training LLMs over Neurally Compressed Text3.Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph4.Visualization-of-Thought …
论文目录~ 1.Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards2.Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates3.Meta-Task Prompting Elicits Embedding from Lar…
此为观看视频How Large Language Model works的笔记。
GPT(Generative Pre-trained Transformer)是一个大语言模型(LLM),可以生成类似人类的文本。本文阐述:
什么是LLMLLM如何工作LLM的应用场景
什么是…
🍉 CSDN 叶庭云:https://yetingyun.blog.csdn.net/ 论文标题:Instruction Tuning for Large Language Models: A Survey
论文地址:https://arxiv.org/abs/2308.10792
指令调优是提升大语言模型(LLMs)性能…
🍉 CSDN 叶庭云:https://yetingyun.blog.csdn.net/ 论文标题:Long-form factuality in large language models
论文链接:https://arxiv.org/abs/2403.18802 论文的关键信息总结如下:
研究问题是什么?论文…
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 11 Jan 2024 Totally 36 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
Leveraging Print Debugging to Improve Code Generation in Large Language Models Authors Xueyu Hu, Kun K…
解决:ModuleNotFoundError: No module named ‘tiktoken’ 文章目录 解决:ModuleNotFoundError: No module named tiktoken背景报错问题报错翻译报错位置代码报错原因解决方法方法一,直接安装方法二,手动下载安装方法三࿰…
AI视野今日CS.NLP 自然语言处理论文速览 Thu, 7 Mar 2024 Totally 52 papers 👉上期速览✈更多精彩请移步主页 Daily Computation and Language Papers
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models Authors Adith…
本文说明以下重要结论 设模型参数量为 N N N,训练数据量(Token)为 D D D,LLM训练中计算量(FLOPs) C ≈ 6 N D C\approx 6ND C≈6ND 参考: 模型训练计算量到底怎么算分析transformer模型的参数…