Thinkwee's Blog

Deep Data Research -- Database as Hunting Ground, LLMs as Hunters

Posted on 2025-11-30 Edited on 2026-02-05 In LLM Views: Word count in article: 4.9k Reading time ≈ 4 mins.

Introducing DDR-Bench. Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models.

(Welcome) to the Era of Wild

Posted on 2025-10-05 Edited on 2025-10-08 In LLM Views: Word count in article: 11k Reading time ≈ 10 mins.

Connecting the dots, (welcome) to the era of wild.

Towards General Reasoning

Posted on 2025-09-13 Edited on 2025-10-07 In LLM Views: Word count in article: 26k Reading time ≈ 23 mins.

How far have we come toward general reasoning? How far are we from it?

Scaling the Environment

Posted on 2025-07-17 Edited on 2025-10-07 In LLM Views: Word count in article: 11k Reading time ≈ 10 mins.

What I Talk About When I Talk About Scaling the Environment?

What is the Next Step for Scaling in the Era of RL for LLM?

Posted on 2025-07-15 Edited on 2025-07-16 In LLM Views: Word count in article: 14k Reading time ≈ 13 mins.

Now that the bitter lesson has stripped away the redundant designs we added in the pre-LLM era, we are ready to scale up. In the era of RL for LLMs, what should we scale next?

LLM Reasoning in 2025

Posted on 2025-05-30 Edited on 2025-07-16 In MyQuestion Views: Word count in article: 1.6k Reading time ≈ 1 mins.

What LLM reasoning looks like in the first half of 2025.

[Some Questions asking Myself 2025.5]

Posted on 2025-05-21 Edited on 2025-07-16 In MyQuestion Views: Word count in article: 17k Reading time ≈ 15 mins.

The second post in my "some very-personal questions to myself" series. It has been over a year since the last post, and much progress on LLMs has come from both academia and industry, which partially answers my questions. I will introduce these works and ask myself some new questions. This post is about Pretrain Ceiling, Second Half, and Scaling the Environment.

[Some Questions asking Myself 2024.4]

Posted on 2024-04-23 Edited on 2025-07-16 In MyQuestion Views: Word count in article: 8.9k Reading time ≈ 8 mins.

Some very personal questions, assumptions, and predictions about the future after the large model era. I hope to make it a habit to write such a future-asking post every half year, to keep myself thinking about the "next token" in the AI era. This post is about Compression, World Model, Agent, and Alignment.

Multi-agent Reinforcement Learning Notes

Posted on 2023-07-20 Edited on 2025-07-16 In RL Views: Word count in article: 21k Reading time ≈ 20 mins.

A simple note on the RL methods used in single-agent and multi-agent settings.

Debates between GPTs

Posted on 2023-06-05 Edited on 2025-07-16 In NLP Views: Word count in article: 2.7k Reading time ≈ 2 mins.

A webpage based on ChatGPT-Shortcut that shows some interesting debates that took place between GPTs.
The demo website is here

Prompt - Task Reformulation in NLP

Posted on 2021-05-13 Edited on 2025-07-16 In NLP Views: Word count in article: 24k Reading time ≈ 21 mins.

Notes on recent template-based task reformulation methods, a particularly interesting direction since the appearance of GPT-3. These methods typically design a prompt for each task, converting samples and tasks into natural language templates that are fed directly into a pre-trained language model to generate text, thereby completing the task indirectly. Constructing prompts unifies the form of downstream tasks with the pre-training task (language modeling), which gives good results in few-shot learning. Key papers to read include the following nine:
- Early work that converts questions into natural language and uses pre-trained language models for answers:
  - (Harvard) Commonsense Knowledge Mining from Pretrained Models
  - (Heidelberg) Argumentative Relation Classification as Plausibility Ranking
  - (NVIDIA) Zero-shot Text Classification With Generative Language Models
- The PET approach (Pattern Exploiting Training):
  - (LMU) Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
  - (LMU) It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
  - (UNC) Improving and Simplifying Pattern Exploiting Training
- Automatically searching for prompts:
  - (UCI, UCB) AUTOPROMPT: Eliciting Knowledge from Language Models with Automatically Generated Prompts
  - (Princeton, MIT) Making Pre-trained Language Models Better Few-shot Learners
  - (THU) GPT Understands, Too
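
The template idea described above can be sketched in a few lines. This is only a toy illustration, not the method of any listed paper: the pattern string, the `VERBALIZER` words, and `toy_scorer` (a stand-in for a real pre-trained masked language model) are all hypothetical names chosen for this example.

```python
# Hypothetical cloze pattern and verbalizer (label -> word that fills the mask).
PATTERN = "{text} It was [MASK]."
VERBALIZER = {"positive": "great", "negative": "terrible"}


def to_cloze(text: str) -> str:
    """Rewrite a classification sample as a fill-in-the-blank sentence."""
    return PATTERN.format(text=text.strip())


def predict(text, mask_scorer):
    """Pick the label whose verbalizer word scores highest at the mask.

    mask_scorer(prompt, word) -> float is assumed to come from a language
    model; here any callable with that shape works.
    """
    prompt = to_cloze(text)
    return max(VERBALIZER, key=lambda label: mask_scorer(prompt, VERBALIZER[label]))


def toy_scorer(prompt, word):
    """Toy stand-in for a language model: counts cue words in the prompt."""
    cues = {"great": ("love", "good"), "terrible": ("hate", "bad")}
    return sum(prompt.lower().count(cue) for cue in cues[word])


print(to_cloze("I love this movie."))
print(predict("I love this movie.", toy_scorer))
```

With a real model, `mask_scorer` would return the model's score for the candidate word at the `[MASK]` position, so the classification task is solved with the same objective the model was pre-trained on.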

Edit-based Text Generation

Posted on 2021-05-11 Edited on 2025-07-16 In NLP Views: Word count in article: 27k Reading time ≈ 24 mins.

Notes on edit-based seq2seq methods from recent years. For tasks whose input and output are in the same language and differ only by minor changes (error correction, simplification, summarization), they offer high efficiency (partially autoregressive or non-autoregressive decoding) and need less data (small output vocabulary).
I mainly read five papers, sorted by their publication date on arXiv:
- (LevT, Facebook) Levenshtein Transformer
- (Huawei) EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing
- (LaserTagger, Google) Encode, Tag, Realize: High-Precision Text Editing
- (PIE) Parallel Iterative Edit Models for Local Sequence Transduction
- (Google) Felix: Flexible Text Editing Through Tagging and Insertion
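
To make the "edit instead of generate" idea concrete, here is a minimal sketch of applying per-token edit tags to a source sentence. The tag format (`KEEP`, `DELETE`, and an optional `|phrase` suffix meaning "insert this phrase before the token") is a simplified assumption for illustration, not the exact scheme of any of the five papers.

```python
def apply_edits(tokens, tags):
    """Rebuild the target sentence from source tokens and one edit tag per token.

    Each tag is "KEEP" or "DELETE", optionally followed by "|phrase" to
    insert that phrase before the current token (hypothetical format).
    """
    if len(tokens) != len(tags):
        raise ValueError("need exactly one tag per source token")
    output = []
    for token, tag in zip(tokens, tags):
        operation, _, added_phrase = tag.partition("|")
        if added_phrase:
            output.extend(added_phrase.split())
        if operation == "KEEP":
            output.append(token)
        elif operation != "DELETE":
            raise ValueError(f"unknown edit operation: {operation}")
    return output


source = "Turing was born in 1912 . Turing died in 1954 .".split()
tags = ["KEEP"] * 5 + ["DELETE", "DELETE|and"] + ["KEEP"] * 4
print(" ".join(apply_edits(source, tags)))
```

Because most tokens are simply kept, the model only has to predict a tag from a small vocabulary at each position, which is where the efficiency and data savings come from.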

Note for VC Dimension

Posted on 2020-05-15 Edited on 2025-07-16 In ML Views: Word count in article: 16k Reading time ≈ 15 mins.

A brief review of the VC dimension. All discussions are based on the simple case of binary classification.

Notes for NLP with Graph-Structured Representations

Posted on 2020-04-05 Edited on 2025-07-16 In ML Views: Word count in article: 14k Reading time ≈ 13 mins.

Notes on Dr. Bang Liu’s paper Natural Language Processing and Text Mining with Graph-Structured Representations, from the University of Alberta.

Study Notes for CS224w

Posted on 2020-03-30 Edited on 2025-07-16 In ML Views: Word count in article: 15k Reading time ≈ 14 mins.

Study notes for Stanford CS224W: Machine Learning with Graphs by Jure Leskovec.

CLSciSumm summary

Posted on 2020-03-27 Edited on 2025-07-16 In NLP Views: Word count in article: 7.6k Reading time ≈ 7 mins.

A brief note on the CLSciSumm Workshop that the CIST lab participated in. The main focus is on methods; the experiments are analyzed in detail in the papers. Papers:

Incremental Decoding

Posted on 2020-03-17 Edited on 2025-07-16 In NLP Views: Word count in article: 16k Reading time ≈ 15 mins.

Notes on how parallel decoding models such as CNN seq2seq and Transformer perform incremental decoding during the inference phase in Fairseq.

BERTology

Posted on 2020-03-02 Edited on 2025-07-16 In NLP Views: Word count in article: 8.5k Reading time ≈ 8 mins.

Notes on A Primer in BERTology: What We Know About How BERT Works.

Structured Neural Summarization, Paper Reading

Posted on 2020-02-28 Edited on 2025-07-16 In NLP Views: Word count in article: 12k Reading time ≈ 11 mins.

Reading notes for Structured Neural Summarization.

SVM

Posted on 2020-02-13 Edited on 2025-07-16 In NLP Views: Word count in article: 13k Reading time ≈ 12 mins.

Long time no see, SVM.

Reformer - Paper Reading

Posted on 2020-02-07 Edited on 2025-07-16 In NLP Views: Word count in article: 9k Reading time ≈ 8 mins.

Reading notes for Reformer.

Paper Reading 4

Posted on 2019-12-16 Edited on 2025-07-16 In NLP Views: Word count in article: 14k Reading time ≈ 12 mins.

Paper reading on:
- GNN Pooling
- Discourse-Aware Summarization
- Siamese BERT
- Large Chatbot

Note for Hierarchical Latent Dirichlet Allocation

Posted on 2019-11-15 Edited on 2025-07-16 In ML Views: Word count in article: 24k Reading time ≈ 21 mins.

Note for Hierarchical Latent Dirichlet Allocation

Paper reading on Knowledge Graphs

Posted on 2019-11-13 Edited on 2025-07-16 In NLP Views: Word count in article: 35k Reading time ≈ 32 mins.

Knowledge Graph Special Collection
- Entity Alignment in Cross-lingual Knowledge Graphs
- Knowledge Graph Language Model
- Dynamic Knowledge Graph Dialogue Generation
- Graph2Seq
- Graph Matching Network
- Dynamic Knowledge Graph Update
- Attention-based Embeddings for Relation Prediction