Thinkwee's Blog

What's the next step to Large Models?

Posted on 2024-04-23 Edited on 2024-05-11 In NLP
Symbols count in article: 4.1k Reading time ≈ 4 mins.

Some very personal questions, assumptions, and predictions about the future after the large-model era. I hope to make a habit of writing such a forward-looking post every six months or every year, to keep myself thinking about the "next token" in the AI era. Still in draft.

Multi-agent Reinforcement Learning Notes

Posted on 2023-07-20 Edited on 2024-04-23 In RL
Symbols count in article: 6.2k Reading time ≈ 6 mins.

A brief note on the RL methods used in single-agent and multi-agent settings.

Debates between GPTs

Posted on 2023-06-05 Edited on 2024-04-23 In NLP
Symbols count in article: 2.8k Reading time ≈ 3 mins.

A web page, adapted from ChatGPT-Shortcut, that shows some interesting debates GPT held with itself.
The demo site is here

Prompt - Task Reformulation in NLP

Posted on 2021-05-13 Edited on 2024-04-23 In NLP
Symbols count in article: 7.6k Reading time ≈ 7 mins.

Notes on recent methods that reformulate tasks with templates, an interesting direction, especially since GPT-3 appeared. These methods typically design a prompt for the task, convert each sample together with the task into a natural-language template, and feed it directly to a pretrained language model, which predicts text and thereby completes the task indirectly. Building the prompt unifies the form of the downstream task with that of the pretraining task (language modeling) and can achieve good results in few-shot learning. The following nine papers are covered:
- Early work that turns the problem into natural language and answers it with a pretrained language model:
  - (Harvard) Commonsense Knowledge Mining from Pretrained Models
  - (Heidelberg) Argumentative Relation Classification as Plausibility Ranking
  - (NVIDIA) Zero-shot Text Classification With Generative Language Models
- The PET line of work, Pattern Exploiting Training:
  - (LMU) Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
  - (LMU) It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
  - (UNC) Improving and Simplifying Pattern Exploiting Training
- Automatically Searching Prompts:
  - (UCI, UCB) AUTOPROMPT: Eliciting Knowledge from Language Models with Automatically Generated Prompts
  - (Princeton, MIT) Making Pre-trained Language Models Better Few-shot Learners
  - (THU) GPT Understands, Too
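
To make the idea of task reformulation concrete, here is a minimal sketch of a cloze-style prompt in the spirit of PET. The pattern string, the label words, and `toy_score` are all assumptions made up for illustration; `toy_score` is only a keyword-counting stand-in for a real masked language model, not how any of the papers above score tokens.

```python
from typing import Callable, Dict

MASK = "[MASK]"


def build_cloze(text: str, pattern: str = "{text} It was {mask}.") -> str:
    """Wrap a raw sample in a natural-language template with one masked slot."""
    return pattern.format(text=text, mask=MASK)


def classify(text: str,
             verbalizer: Dict[str, str],
             score_token: Callable[[str, str], float]) -> str:
    """Return the label whose verbalizer word scores highest in the masked slot."""
    prompt = build_cloze(text)
    return max(verbalizer, key=lambda label: score_token(prompt, verbalizer[label]))


def toy_score(prompt: str, token: str) -> float:
    """Hypothetical stand-in for a masked LM: counts hand-picked cue words."""
    cues = {"great": ("love", "wonderful"), "terrible": ("boring", "awful")}
    return float(sum(prompt.lower().count(cue) for cue in cues[token]))


# The verbalizer maps each task label to a word the language model could predict.
VERBALIZER = {"positive": "great", "negative": "terrible"}

print(build_cloze("I love this movie."))
print(classify("I love this movie.", VERBALIZER, toy_score))
```

In a real setup, `score_token` would be replaced by the probability a pretrained model assigns to the verbalizer word at the masked position; the classification task itself never needs a dedicated output layer.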

Edit-based Text Generation

Posted on 2021-05-11 Edited on 2024-04-23 In NLP
Symbols count in article: 7.6k Reading time ≈ 7 mins.

Notes on recent edit-based seq2seq methods. For tasks where the input and output are in the same language and differ only by small changes (error correction, simplification, summarization), these methods are efficient (partially autoregressive or non-autoregressive decoding) and less data-hungry (small output vocabulary).
Five papers are covered, ordered by when they appeared on arXiv:
- (LevT, Facebook) Levenshtein Transformer
- (Huawei) EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing
- (LaserTagger, Google) Encode, Tag, Realize: High-Precision Text Editing
- (PIE, Indian Institute of Technology) Parallel Iterative Edit Models for Local Sequence Transduction
- (Google) Felix: Flexible Text Editing Through Tagging and Insertion

Note for VC Dimension

Posted on 2020-05-15 Edited on 2024-04-23 In ML
Symbols count in article: 5.1k Reading time ≈ 5 mins.

A brief walkthrough of the VC dimension. All of the discussion starts from the simple case of binary classification.

Notes for NLP with Graph-Structured Representations

Posted on 2020-04-05 Edited on 2024-04-23 In ML
Symbols count in article: 5.7k Reading time ≈ 5 mins.

Notes on the thesis of Dr. Bang Liu of the University of Alberta, Natural Language Processing and Text Mining with Graph-Structured Representations.

Study Notes for CS224w

Posted on 2020-03-30 Edited on 2024-04-23 In ML
Symbols count in article: 9.3k Reading time ≈ 8 mins.

Study notes for Jure Leskovec's Stanford CS224W: Machine Learning with Graphs. To be continued.

CLSciSumm summary

Posted on 2020-03-27 Edited on 2024-04-23 In NLP
Symbols count in article: 2.3k Reading time ≈ 2 mins.

A summary of our lab's participation in the CLSciSumm Workshop, focusing on the methods; the experimental analysis is covered in detail in the papers. Papers:

Incremental Decoding

Posted on 2020-03-17 Edited on 2024-04-23 In NLP
Symbols count in article: 7k Reading time ≈ 6 mins.

Notes on how Fairseq handles incremental decoding at inference time for parallel-decoding models such as CNN seq2seq and the Transformer.
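
The core idea of incremental decoding can be sketched without any framework: cache what earlier steps already computed, so each new step only processes the newest token. The code below is a toy under strong assumptions (a single self-attention layer, identity projections so that query = key = value, pure Python lists); the class and function names are invented for this sketch and are not Fairseq's API.

```python
import math
from typing import List

Vector = List[float]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def naive_step(prefix: List[Vector]) -> Vector:
    """Rerun causal self-attention over the whole prefix; keep only the last output."""
    outputs = [attend(prefix[t], prefix[: t + 1], prefix[: t + 1])
               for t in range(len(prefix))]
    return outputs[-1]


class IncrementalState:
    """Cache of past keys/values so each step touches only the newest token."""

    def __init__(self) -> None:
        self.keys: List[Vector] = []
        self.values: List[Vector] = []

    def step(self, state: Vector) -> Vector:
        self.keys.append(state)
        self.values.append(state)
        return attend(state, self.keys, self.values)


states = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
naive = [naive_step(states[: t + 1]) for t in range(len(states))]
cache = IncrementalState()
incremental = [cache.step(s) for s in states]
print(incremental == naive)
```

Both paths produce the same outputs; the cached version simply avoids recomputing attention for positions whose results cannot change.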

BERTology

Posted on 2020-03-02 Edited on 2024-04-23 In NLP
Symbols count in article: 3.1k Reading time ≈ 3 mins.

A translation of A Primer in BERTology: What we know about how BERT works, which systematically surveys recent research on BERT's interpretability and extensibility. Original paper: arXiv PDF

Structured Neural Summarization, Paper Reading

Posted on 2020-02-28 Edited on 2024-04-23 In NLP
Symbols count in article: 4.6k Reading time ≈ 4 mins.

Reading notes on STRUCTURED NEURAL SUMMARIZATION.

SVM

Posted on 2020-02-13 Edited on 2024-04-23 In NLP
Symbols count in article: 5.1k Reading time ≈ 5 mins.

Key steps of some of the derivations.

Reformer - Paper Reading

Posted on 2020-02-07 Edited on 2024-04-23 In NLP
Symbols count in article: 2.9k Reading time ≈ 3 mins.

A walkthrough of the Reformer paper.

Paper Reading 4

Posted on 2019-12-16 Edited on 2024-04-23 In NLP
Symbols count in article: 4.9k Reading time ≈ 4 mins.

Reading papers over the New Year.

Edge pooling
Discourse-aware extractive summarization
Discourse-aware abstractive summarization
Siamese BERT
A giant chatbot

Note for Hierarchical Latent Dirichlet Allocation

Posted on 2019-11-15 Edited on 2024-04-23 In ML
Symbols count in article: 8.8k Reading time ≈ 8 mins.

Study notes on Hierarchical Latent Dirichlet Allocation, a hierarchical topic model. As before, they draw heavily on Xu Yida's tutorials.

Paper reading on Knowledge Graphs

Posted on 2019-11-13 Edited on 2024-04-23 In NLP
Symbols count in article: 13k Reading time ≈ 12 mins.

A collection of papers on knowledge graphs.

Entity alignment in cross-lingual knowledge graphs
Knowledge Graph Language Model
Dialogue generation with dynamic knowledge graphs
Graph2Seq
Graph Matching Network
Dynamically updating knowledge graphs
Attention-based Embeddings for Relation Prediction

Note for Heterogeneous Information Network

Posted on 2019-10-30 Edited on 2024-04-23 In NLP
Symbols count in article: 4.7k Reading time ≈ 4 mins.

Notes on some recent approaches to heterogeneous information networks.

PathSim
HGNN
HGAN
HGAN for text classification
With attributes: Attributed Multiplex Heterogeneous Network
Meta-graph Guided Random Walks
TBD

Note for Graph-based Summarization

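Posted on 2019-10-03 Edited on 2024-04-23 In NLP
Symbols count in article: 7.4k Reading time ≈ 7 mins.

Selected readings on graph-based automatic summarization.

AMR abstractive summarization
Two papers on AMR multi-document summarization
pagerank in encoder attention
Building a graph via topic modeling and using ILP for extractive summarization
GCN-based multi-document extractive summarization
STRUCTURED NEURAL SUMMARIZATION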

Easy Reinforcement Learning Notes

Posted on 2019-09-23 Edited on 2024-04-23 In NLP
Symbols count in article: 4.8k Reading time ≈ 4 mins.

Minimalist style.

Q-learning
Sarsa
Sarsa(\(\lambda\))
DQN
Double DQN
DQN with Prioritized Experience replay
Dueling DQN
Policy Gradient
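
For readers who want the first two entries in concrete form, here is a minimal sketch of the tabular Q-learning and Sarsa updates. The state and action names, the learning rate `alpha`, and the discount `gamma` are illustrative assumptions; the only difference between the two rules is which next-state value enters the target.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Sequence, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_learning_update(q: QTable, state: Hashable, action: Hashable, reward: float,
                      next_state: Hashable, actions: Sequence[Hashable],
                      alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Off-policy: bootstrap from the best action available in the next state."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def sarsa_update(q: QTable, state: Hashable, action: Hashable, reward: float,
                 next_state: Hashable, next_action: Hashable,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
    """On-policy: bootstrap from the action the agent actually takes next."""
    target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (target - q[(state, action)])


ACTIONS = ("left", "right")

q_off: QTable = defaultdict(float)
q_off[("s1", "left")], q_off[("s1", "right")] = 1.0, 2.0
q_learning_update(q_off, "s0", "right", 1.0, "s1", ACTIONS)

q_on: QTable = defaultdict(float)
q_on[("s1", "left")], q_on[("s1", "right")] = 1.0, 2.0
sarsa_update(q_on, "s0", "right", 1.0, "s1", next_action="left")

print(q_off[("s0", "right")], q_on[("s0", "right")])
```

With the same transition, Q-learning moves toward the value of the greedy next action, while Sarsa moves toward the value of the (possibly exploratory) action that was really chosen.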

Summarization-Related Papers Reading (ACL/NAACL 2019)

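Posted on 2019-08-15 Edited on 2024-04-23 In NLP
Symbols count in article: 7k Reading time ≈ 6 mins.

Selected readings of summarization-related papers from ACL/NAACL 2019.

An improved similarity measure for DPPs
STRASS: backpropagation for extractive summarization
Translate first, then generate the summary
Reading comprehension + automatic summarization
BiSET: Retrieve + Fast Rerank + Selective Encoding + Template Based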

Study Notes for Cognitive Graph

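Posted on 2019-08-13 Edited on 2024-04-23 In NLP
Symbols count in article: 5.1k Reading time ≈ 5 mins.

Today I read a paper on machine reading comprehension from a Tsinghua and Alibaba team, Cognitive Graph for Multi-Hop Reading Comprehension at Scale. It was also accepted at ACL 2019, but I did not include it in my ACL 2019 paper-reading post because I felt it deserved a post of its own. It won neither a best paper nor an outstanding paper award, nor even a nomination, yet its ideas and methodology are excellent: it realizes connectionism plus knowledge-based reasoning in the simplest possible way.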

Interview Summary for NLP

Posted on 2019-08-09 Edited on 2024-04-23 In Other
Symbols count in article: 1k Reading time ≈ 1 min.

A summary of my interview experience in June.

Study Notes for Correlation Explanation

Posted on 2019-07-29 Edited on 2024-04-23 In ML
Symbols count in article: 8.6k Reading time ≈ 8 mins.

Notes on CorEx (Correlation Explanation).

Outstanding Papers Reading (ACL 2019)

Posted on 2019-07-28 Edited on 2024-04-23 In NLP
Symbols count in article: 8.4k Reading time ≈ 8 mins.

Selected readings of ACL 2019 award-winning papers.

Using an oracle for sentence-level teacher forcing
speaker commitment
An evaluation framework for summarization that combines multiple metrics
Zero-Shot Entity Linking

Note for Variational Auto-Encoder

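Posted on 2019-03-20 Edited on 2024-04-23 In ML
Symbols count in article: 5k Reading time ≈ 5 mins.

Study notes on the variational auto-encoder.
References:
The original paper and the two blog posts above already explain the VAE very clearly; what I write is only a restatement, to work through it myself. If you come across this post, I recommend reading those three references first.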

Glove Embedding - Mathematical Derivation

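Posted on 2019-01-13 Edited on 2024-04-23 In ML
Symbols count in article: 3.7k Reading time ≈ 3 mins.

A record of the mathematical derivation of GloVe word vectors. The original paper does not arrive at its objective function by drawing up a model; it obtains it through purely mathematical manipulation, which is a very interesting way to design a method. It also writes out the mathematical essence of word2vec for comparison.
Original paper: GloVe: Global Vectors for Word Representation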
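
For reference, the objective that the derivation ends up with is a weighted least-squares fit of log co-occurrence counts: the sum over word pairs of f(X_ij) (w_i · w̃_j + b_i + b̃_j − log X_ij)². The sketch below evaluates it in plain Python. The data layout (a dict of counts, lists of vectors) is an assumption made for this example; `x_max = 100` and `alpha = 0.75` are the values the paper reports using.

```python
import math
from typing import Dict, List, Tuple


def weight(count: float, x_max: float = 100.0, alpha: float = 0.75) -> float:
    """Weighting function f(x): damps rare pairs and caps frequent ones at 1."""
    return (count / x_max) ** alpha if count < x_max else 1.0


def glove_loss(cooc: Dict[Tuple[int, int], float],
               w: List[List[float]], w_ctx: List[List[float]],
               b: List[float], b_ctx: List[float]) -> float:
    """Sum of f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2 over observed pairs."""
    loss = 0.0
    for (i, j), count in cooc.items():
        if count <= 0:
            continue  # f(0) = 0, so unobserved pairs contribute nothing
        pred = sum(a * c for a, c in zip(w[i], w_ctx[j])) + b[i] + b_ctx[j]
        loss += weight(count) * (pred - math.log(count)) ** 2
    return loss


cooc = {(0, 1): 100.0}                      # toy co-occurrence count X_01
vectors = [[1.0, 0.0], [0.0, 1.0]]          # toy word and context vectors
zero_bias = [0.0, 0.0]
fitted_bias = [0.0, math.log(100.0)]        # context bias that fits log X_01 exactly

print(glove_loss(cooc, vectors, vectors, zero_bias, zero_bias))
print(glove_loss(cooc, vectors, vectors, zero_bias, fitted_bias))
```

The second call prints zero because the bias alone reproduces log X_01; training adjusts the vectors and biases jointly to push this loss down over all observed pairs.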

Paper Reading 3

Posted on 2019-01-03 Edited on 2024-04-23 In ML
Symbols count in article: 3.7k Reading time ≈ 3 mins.

Convolutional sequence to sequence
Robust unsupervised cross-lingual word embedding mapping

Notes for Computational Linguistics

Posted on 2018-11-16 Edited on 2024-04-23 In ML
Symbols count in article: 37k Reading time ≈ 33 mins.

Course notes for Computational Linguistics. Reference textbook: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Some formulas are pending revision.

Logistic Regression and Maximum Entropy

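Posted on 2018-10-14 Edited on 2024-04-23 In ML
Symbols count in article: 4k Reading time ≈ 4 mins.

A translation of John Mount's The equivalence of logistic regression and maximum entropy models, with an explanation that this proof is a special case of the general derivation of the maximum entropy model presented in the book Statistical Learning Methods.

Conclusions

The maximum entropy model is softmax classification.
Under the balance condition of generalized linear models, the model mapping function that satisfies the maximum entropy condition is the softmax function.
The book Statistical Learning Methods gives the maximum entropy model in terms of feature functions; both it and softmax regression are log-linear models.
When the feature function is extended from a binary function to the feature value itself, the maximum entropy model reduces to the softmax regression model.
What maximum entropy maximizes is the conditional entropy, not the entropy of the conditional probability, nor the entropy of the joint probability.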
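
The conclusions above can be summarized in one short derivation. The notation below (π for the model's class probabilities, λ and β for Lagrange multipliers, A for the label indicator) is chosen for this sketch and may differ from the post's own symbols; the last step uses the normalization constraint to eliminate β_i.

```latex
% Training set: (x_i, y_i), i = 1..m, with features x_{ij} and class label y_i.
% \pi(x_i)_v : model probability that x_i belongs to class v.
% A(u, v) = 1 if u = v, else 0.
\[
\begin{aligned}
\max_{\pi}\quad & -\sum_{i=1}^{m}\sum_{v} \pi(x_i)_v \log \pi(x_i)_v \\
\text{s.t.}\quad & \sum_{v} \pi(x_i)_v = 1 \quad \forall i, \\
& \sum_{i=1}^{m} \pi(x_i)_u \, x_{ij} = \sum_{i=1}^{m} A(u, y_i)\, x_{ij} \quad \forall u, j
\quad \text{(balance condition)}
\end{aligned}
\]

% Stationarity of the Lagrangian in \pi(x_i)_u,
% with multipliers \lambda_{u,j} (balance) and \beta_i (normalization):
\[
\lambda_u \cdot x_i + \beta_i - \log \pi(x_i)_u - 1 = 0
\;\Longrightarrow\;
\pi(x_i)_u = \frac{e^{\lambda_u \cdot x_i}}{\sum_{v} e^{\lambda_v \cdot x_i}}
\]
```

The objective sums the entropy of the predicted class distribution over the training points, which is the empirical conditional entropy named in the last conclusion, and the maximizer is exactly the softmax function.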