Thinkwee's Blog

Too Stupid to Give Up Learning

How far have we gone towards general reasoning? How far are we from general reasoning?

Read more »

What I Talk About When I Talk About Scaling the Environment?

Read more »

The second post in my "some very-personal questions to myself" series. It's been over a year since the last post, and much progress on LLMs has been made in academia and industry, which partially answers my questions. I will introduce these works and ask myself some new ones. This post is about the Pretrain Ceiling, the Second Half, and Scaling the Environment.

Read more »

Some very personal questions, assumptions, and predictions on the future after the large-model era. I hope to make it a habit to write such a future-asking post every half year, to keep myself thinking about the "next token" of the AI era. This post is about Compression, World Model, Agent, and Alignment.

Read more »

  • Record of recent template-based task-reformulation methods, a particularly interesting direction since the appearance of GPT-3. These methods design prompts for tasks, converting samples and tasks into natural-language templates that are fed directly into pre-trained language models to generate text, thereby completing the task indirectly. Constructing prompts unifies the form of downstream tasks with the pre-training task (language modeling), which works well in few-shot learning. Key papers to read are the following nine (a minimal cloze-prompt sketch follows the list):
    • Early work that converts questions into natural language and uses pre-trained language models for answers:
      • (Harvard) Commonsense Knowledge Mining from Pretrained Models
      • (Heidelberg) Argumentative Relation Classification as Plausibility Ranking
      • (NVIDIA) Zero-shot Text Classification With Generative Language Models
    • The PET approach, Pattern Exploiting Training:
      • (LMU) Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
      • (LMU) It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
      • (UNC) Improving and Simplifying Pattern Exploiting Training
    • Automatically constructing prompts, Automatically Searching Prompts:
      • (UCI, UCB) AUTOPROMPT: Eliciting Knowledge from Language Models with Automatically Generated Prompts
      • (Princeton, MIT) Making Pre-trained Language Models Better Few-shot Learners
      • (THU) GPT Understands, Too
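To make the template idea concrete, here is a minimal cloze-style sketch in the spirit of PET: a hand-written pattern turns a classification sample into a fill-in-the-blank sentence, and a verbalizer maps the label words predicted at the mask back to task labels. The pattern and verbalizer below are my own toy example rather than one taken from these papers; only the HuggingFace fill-mask pipeline usage is real.

```python
# Minimal cloze-prompt classification sketch (PET-style).
# The pattern and verbalizer are illustrative, not from any cited paper.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-uncased")

# Verbalizer: label words the model may predict at [MASK] -> task labels.
verbalizer = {"great": "positive", "terrible": "negative"}

def classify(review: str) -> str:
    # Pattern: wrap the raw sample in a cloze template.
    prompt = f"{review} All in all, it was [MASK]."
    # Restrict the masked-LM prediction to the verbalizer's label words.
    preds = unmasker(prompt, targets=list(verbalizer.keys()))
    best = max(preds, key=lambda p: p["score"])
    return verbalizer[best["token_str"]]

print(classify("Best pizza I have ever had!"))     # expected: positive
print(classify("The service was slow and rude."))  # expected: negative
```

No gradient step is involved: a frozen masked LM plus a good pattern already yields a usable zero-/few-shot classifier, which is exactly what these papers exploit and refine.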
Read more »

  • A record of seq2seq editing methods from recent years. For tasks whose input and output are in the same language and differ only slightly (error correction, simplification, summarization), these methods are efficient (partially autoregressive or non-autoregressive decoding) and less data-hungry (small output vocabulary). A toy tag-and-realize sketch follows the list.
  • Mainly five papers, sorted by their publication date on arXiv:
    • (LevT, Facebook) Levenshtein Transformer
    • (Huawei) EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing
    • (LaserTagger, Google) Encode, Tag, Realize: High-Precision Text Editing
    • (PIE) Parallel Iterative Edit Models for Local Sequence Transduction
    • (Google) Felix: Flexible Text Editing Through Tagging and Insertion
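As promised above, a toy tag-and-realize sketch in the spirit of LaserTagger: each source token receives an edit tag (keep or delete, optionally carrying a phrase to insert), and a trivial realizer stitches the output together. The tagger itself, normally a BERT-style encoder predicting one tag per token, is omitted; the tags below are hand-picked for illustration.

```python
# Tag-then-realize text editing, LaserTagger-style (illustrative sketch).
from typing import List, Tuple

# An edit tag: (operation, phrase to insert before this token).
# Insertable phrases come from a small fixed vocabulary, which is why
# the output space is tiny compared with full seq2seq generation.
Tag = Tuple[str, str]  # operation in {"KEEP", "DELETE"}

def realize(tokens: List[str], tags: List[Tag]) -> str:
    out = []
    for token, (op, phrase) in zip(tokens, tags):
        if phrase:           # insertion attached to this position
            out.append(phrase)
        if op == "KEEP":
            out.append(token)
    return " ".join(out)

src = "the the quick fox jumped over lazy dog".split()
tags = [
    ("KEEP", ""), ("DELETE", ""),   # drop the duplicated "the"
    ("KEEP", ""), ("KEEP", ""), ("KEEP", ""), ("KEEP", ""),
    ("KEEP", "the"),                # insert "the" before "lazy"
    ("KEEP", ""),
]
print(realize(src, tags))  # -> "the quick fox jumped over the lazy dog"
```

Because all tags can be predicted in parallel, decoding is non-autoregressive; iterative variants such as PIE and LevT re-apply the edit model until the sequence stops changing.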
Read more »

A brief review of the VC dimension. All discussions are based on the simple case of binary classification.
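For reference, the two definitions the review builds on (binary labels; notation mine): a hypothesis class \(H\) shatters a set \(S=\{x_1,\dots,x_m\}\) if it realizes all \(2^m\) possible labelings of \(S\), and the VC dimension is the size of the largest shattered set:

\[
\mathrm{VC}(H) = \max\{\, m : \exists S,\ |S| = m,\ H \text{ shatters } S \,\}
\]

For example, halfspaces in \(\mathbb{R}^2\) shatter some set of 3 points but no set of 4, so their VC dimension is 3.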

Read more »

Notes on Dr. Bang Liu's doctoral thesis, Natural Language Processing and Text Mining with Graph-Structured Representations, from the University of Alberta.

Read more »

Study notes for Stanford CS224W: Machine Learning with Graphs by Jure Leskovec.

Read more »

A brief note on the CLSciSumm Workshop that the CIST lab participated in; the main focus is on methods, while the experiments are analyzed in detail in the papers. Papers:

Read more »

Notes on the incremental decoding of parallel-decoding models such as CNN seq2seq and the Transformer during the inference phase in Fairseq; a minimal sketch of the caching idea follows.
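A minimal sketch of that caching idea (not Fairseq's actual API; all names below are illustrative): each decoding step projects only the newest token and appends its key/value to a cache, so self-attention at step \(t\) costs \(O(t)\) rather than re-running the decoder over the whole prefix.

```python
# Incremental decoding cache, illustrative sketch (not Fairseq's API).
import numpy as np

def attention(q, K, V):
    # Single-query scaled dot-product attention over cached keys/values.
    scores = K @ q / np.sqrt(q.shape[0])  # (t,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V                    # (d,)

d = 8
rng = np.random.default_rng(0)
W_q, W_k, W_v = (rng.standard_normal((d, d)) for _ in range(3))

# The cache grows by one row per generated token, mirroring the role of
# Fairseq's per-layer incremental state.
cache = {"K": np.empty((0, d)), "V": np.empty((0, d))}

def decode_step(x, cache):
    # Project ONLY the newest token; everything older is already cached.
    cache["K"] = np.vstack([cache["K"], (W_k @ x)[None, :]])
    cache["V"] = np.vstack([cache["V"], (W_v @ x)[None, :]])
    return attention(W_q @ x, cache["K"], cache["V"])

for _ in range(5):
    x = rng.standard_normal(d)   # stand-in for the newest token embedding
    out = decode_step(x, cache)
print(cache["K"].shape)          # (5, 8): one cached key per step
```

CNN decoders cache differently (a sliding window of the last kernel-width inputs rather than all past keys), but the interface is the same: state in, one token's computation, state out.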

Read more »

Long time no see, SVM.

Read more »

Paper reading on:

  • GNN Pooling
  • Discourse-Aware Summarization
  • Siamese BERT
  • Large Chatbot
Read more »

  • Knowledge Graph Special Collection
    • Entity Alignment in Cross-lingual Knowledge Graphs
    • Knowledge Graph Language Model
    • Dynamic Knowledge Graph Dialogue Generation
    • Graph2Seq
    • Graph Matching Network
    • Dynamic Knowledge Graph Update
    • Attention-based Embeddings for Relation Prediction
Read more »

Notes on some recent work on heterogeneous information networks:

  • PathSim
  • HGNN
  • HGAN
  • HGAN for text classification
  • Attributed Multiplex Heterogeneous Network
  • Meta-graph Guided Random Walks
Read more »

Selected readings on graph-based automatic summarization papers:

  • AMR-based abstractive summarization
  • Two papers on AMR multi-document summarization
  • PageRank in encoder attention
  • Building a graph via topic modeling and using ILP for extractive summarization
  • GCN-based multi-document extractive summarization
  • Structured Neural Summarization
Read more »

RL study notes, minimalist style (a tabular Q-learning sketch follows the list):

  • Q-learning
  • Sarsa
  • Sarsa(\(\lambda\))
  • DQN
  • Double DQN
  • DQN with Prioritized Experience replay
  • Dueling DQN
  • Policy Gradient
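Staying minimalist, here is a toy tabular Q-learning run on a 1-D chain (my own example, not from the notes). The update is \(Q(s,a) \leftarrow Q(s,a) + \alpha\,[\,r + \gamma \max_{a'} Q(s',a') - Q(s,a)\,]\); swapping the \(\max\) for the value of the action actually taken next turns it into Sarsa.

```python
# Tabular Q-learning on a 1-D chain: start at state 0, reward 1 at the
# rightmost state. Illustrative toy example.
import random

N_STATES, ACTIONS = 6, (0, 1)          # actions: 0 = left, 1 = right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else s + 1
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0), s2 == N_STATES - 1

for _ in range(200):                    # episodes
    s, done = 0, False
    while not done:
        # epsilon-greedy behavior policy
        a = random.choice(ACTIONS) if random.random() < EPS \
            else max(ACTIONS, key=lambda a: Q[s][a])
        s2, r, done = step(s, a)
        # Off-policy TD target: greedy max over next-state actions.
        target = r + (0.0 if done else GAMMA * max(Q[s2]))
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

print([round(max(q), 2) for q in Q])    # values grow toward the goal state
```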
Read more »

Selected readings of ACL/NAACL 2019 automatic summarization papers.

  • DPPs similarity measurement improvement
  • STRASS: backpropagation for extractive summarization
  • Translate first, then generate the summary
  • Reading comprehension + automatic summarization
  • BiSET: Retrieve + Fast Rerank + Selective Encoding + Template Based

Read more »

Selected readings from ACL 2019 award-winning papers.

  • Using Oracle for sentence-level teacher forcing
  • Speaker commitment
  • An evaluation framework for summaries that combines multiple metrics
  • Zero-Shot Entity Linking
Read more »
