Technology

Multi-Agent Reinforcement Learning | Information Design in Multi-Agent Reinforcement Learning (Chinese version)

Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang. NeurIPS 2023. This is currently my most representative work.

Mar 15, 2024 Artificial Intelligence, Multi-Agent Reinforcement Learning

Multi-Agent Reinforcement Learning | Information Design in Multi-Agent Reinforcement Learning

Yue Lin, Wenhao Li, Hongyuan Zha, Baoxiang Wang. NeurIPS 2023. This is currently my most representative work.

Mar 14, 2024 Artificial Intelligence, Multi-Agent Reinforcement Learning

Economics & Game Theory | Information Design in 10 Minutes

A brief introduction from the perspective of BCE (Bayes correlated equilibrium), with some examples.

Aug 10, 2023 Interdisciplinarity, Economics & Game Theory

Economics & Game Theory | Information Design

A sender with an informational advantage sends messages to steer a receiver's action policy. The two may have different objectives. Information design optimizes the sender's signaling scheme.

Jun 1, 2023 Interdisciplinarity, Economics & Game Theory

Multi-Agent Reinforcement Learning | PSRO: Policy-Space Response Oracles

Lanctot, Marc, et al. "A unified game-theoretic approach to multiagent reinforcement learning." Advances in neural information processing systems 30 (2017).

Mar 8, 2025 Artificial Intelligence, Multi-Agent Reinforcement Learning

Algorithm | A* in 10 Minutes

Resources Red Blob Games Dijkstra - Bilibili Zhihu https://zhuanlan.zhihu.com/p/13185307595 https://zhuanlan.zhihu.com/p/595716772 Key Idea A* = BFS + priority...

Mar 6, 2025 Mathematics, Algorithm

Misc Toolbox | Obsidian-Based Workspace

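Preface Github Repo: draft01_empty After using a computer for a long time, whenever I want to write something down I either jot it down casually, or spend a long time finding the right category and then create an md file in a dedicated folder. The former is hard to manage, and worrying that I will not find the note later lowers my desire to record anything. The latter is slow to start, so sometimes the inspiration is lost or there is no time to write it down, and recording becomes painful. I used to take notes on my blog, but many things are not suitable to publish, and the preview is not immediate. I used to think this was a small matter, but over time I found...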

Jan 17, 2025 Efficiency, Misc Toolbox

Economics & Game Theory | Cheap Talk

To give a formal definition, cheap talk is communication that is: costless to transmit and receive; non-binding (i.e. does not limit strategic choices by either party); unverifiable (i...

Dec 22, 2024 Interdisciplinarity, Economics & Game Theory

Code Utils | Llama Memo

Sources Ollama. It is easy to use. Hugging Face. It provides tokenizer in Python API. Ollama Installation Download the app on the website. Install the app. Run ollama run llama3 in...

Jul 16, 2024 Efficiency, Code Utils

Machine Learning Basics | GPT

The following part has not been finished yet. GPT-1 Resources Radford, Alec, et al. (OpenAI) “Improving language understanding by generative pre-training.” (2018). GPT-1 = Decoder (in Tran...

Jun 10, 2024 Artificial Intelligence, Machine Learning Basics

Machine Learning Basics | Transformer

Encoder-Decoder Variable-length inputs Truncation and Padding Dive Into Deep Learning 10.5.3 Relation Network A blog ICLR 2017 Embedding. Enco...

Jun 5, 2024 Artificial Intelligence, Machine Learning Basics

Economics & Game Theory | Bargaining

An Extensive-Form Game Model Ultimatum Two people use the following procedure to split $1$ object: Player 1 offers Player 2 some amount $0 \leq x \leq 1$ If Player 2 accepts the outcome ...

May 17, 2024 Interdisciplinarity, Economics & Game Theory

Economics & Game Theory | Extensive-Form Games and Subgame Perfect Equilibrium

Resources This post uses the material from the following works: MIT 6.254 2010: Game Theory With Engineering Applications MIT 14.12 2012: Economic Applications Of Game Theory Tadelis, Steve...

May 16, 2024 Interdisciplinarity, Economics & Game Theory

Math Miscs | Mathematica Memos

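I heard that sympy in Python can also do some derivation and simplification; I will look into it later. Mathematica takes up too much disk space. Basics Symbols such as $\epsilon$ are entered as Epsilon, with the first letter capitalized. Enter inserts a newline; Shift + Enter executes. Names are case-sensitive: quantities that differ only in case are two different quantities. In function calls, arguments are enclosed in square brackets. A semicolon ; at the end of an expression suppresses the output of that expression. * is elementwise multiplication; the period . is the linear-algebra...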

Apr 27, 2024 Mathematics, Math Miscs

Misc Toolbox | Building My Own PC

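Resources PC building: [[PC Building Tutorial] The best PC building tutorial on the internet, bar none] Compatibility: [[Worth Bookmarking] What you must know before building a DIY PC! A step-by-step guide to checking hardware compatibility in your parts list! A hardware compatibility checklist for DIY PCs! A must-read for beginners before building!] OS installation: [[PC Building Tutorial] A very detailed Windows 10 installation tutorial, covering both the official ISO direct install and the PE method, with UEFI+GUID and Legacy+MBR partitioning] CPU...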

Apr 26, 2024 Efficiency, Misc Toolbox

Reinforcement Learning | TRPO Details

The original paper: Schulman, John, et al. “Trust region policy optimization.” International conference on machine learning. PMLR, 2015. Overview This derivation comes from Appendix A.1 ...

Apr 24, 2024 Artificial Intelligence, Reinforcement Learning

Economics & Game Theory | Fairness Versus Reason in the Ultimatum Game

The Game The experimenter assigns a certain sum, and the Proposer can offer a share of it to the Responder. If the Responder (who knows the sum) accepts, the sum is split accordingly between the...

Apr 22, 2024 Interdisciplinarity, Economics & Game Theory

Economics & Game Theory | Evolutionary Game Theory

Basic Symmetric Model with Stochastic Strategies This section is a summary of Chapter 29 of the book “Algorithmic Game Theory”. Agents (organisms) The number of agents is infini...

Apr 20, 2024 Interdisciplinarity, Economics & Game Theory

Code Utils | Code Visualization

Function Call Graph Not working: pyan3 pycallgraph pycallgraph2 Inheritance Visualization Example 1: See my blog. Example 2: pyreverse -o png -p outputed_diagram main.py Agent.p...

Apr 7, 2024 Efficiency, Code Utils

Code Utils | LyPythonToolbox

Resources Github Repo My Full Code Toolbox Install Install: pip install LyPythonToolbox Update: pip install --upgrade LyPythonToolbox Print Tricks lyprint_separator from LyPythonToolb...

Apr 5, 2024 Efficiency, Code Utils

Code Utils | Github Memo

Create a Repo Click the green button New on the GitHub repo website. Do not check the Add a README file. Copy the link with the .git extension. Create a directory locally and enter it in a...

Apr 2, 2024 Efficiency, Code Utils

Code Utils | Python Project Template

How to Use Download LyPythonProjectTemplate Decompress it. Create a new Github project. See my blog. Copy the contents of LyPythonProjectTemplate into the root folder of your new project....

Mar 31, 2024 Efficiency, Code Utils

Misc Toolbox | My Website

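Jekyll Pronounced like "街口" (jiē kǒu) [Repost - Jekyll - Static Site Generator Tutorial, bilingual subtitles] Migration and Deployment Migration For example, when switching to a new computer, how do I set this blog up again? Make a copy of this file. Install the environment. ruby and jekyll: https://jekyllrb.com/docs/installation/macos/ bundle install...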

Mar 10, 2024 Efficiency, Misc Toolbox

Misc Toolbox | MacOS Workspace

Desktop Wallpaper: A Seascape, Shipping by Moonlight - Monet False Knees - Joshua Basic Tools Hidden Bar Wins and Magnet Window Arrangement Wins: https://wins.cool/html/index.html Reco...

Mar 2, 2024 Efficiency, Misc Toolbox

Code Utils | PyTorch Toolbox

Nets Linear / MLP PyTorch Document - Linear Initialization Parameters in_features out_features bias=True input.shape: (*, in_features) output.shape: (*, out_...

Mar 1, 2024 Efficiency, Code Utils

Code Utils | Python Toolbox

This post was completed with the assistance of ChatGPT-4. Inheritance Inheritance allows a class (known as a child class) to inherit attributes and methods from another class (known as a parent ...

Dec 28, 2023 Efficiency, Code Utils

Reinforcement Learning | Stable Baselines3

Getting Started Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch. — Stable Baseline3 Docs. Resources [Stable Baseline3 Docs] [...

Dec 28, 2023 Artificial Intelligence, Reinforcement Learning

Multi-Agent Reinforcement Learning | Overcooked: A MARL Task

A Brief Intro This MARL environment remade the Overcooked game on Steam. Some features (game rules) are cut to simplify the situation. This game requires two agents to coordinate to cook. If they...

Dec 28, 2023 Artificial Intelligence, Multi-Agent Reinforcement Learning

Misc Toolbox | Tools of Visual Studio Code

This note will be consistently updated. Shortcuts Command + k Command + s: Keyboard Shortcuts GOTO Command + P: Go to file Command + Shift + O: Go to symbol in editor Command + T: Go to...

Dec 24, 2023 Efficiency, Misc Toolbox

Machine Learning Basics | HyperNetworks

Introduction [Paper]: HyperNetworks The following part has not been finished yet. Application in QMIX Illustration from the corresponding paper. The following statements from the paper are...

Nov 13, 2023 Artificial Intelligence, Machine Learning Basics

Machine Learning Basics | Decision Transformers

Decision Transformer Paper: Decision Transformer: Reinforcement Learning via Sequence Modeling - NeurIPS 2021 [Website] [Code] Illustration from the corresponding paper. Illustration fro...

Nov 11, 2023 Artificial Intelligence, Machine Learning Basics

Math Miscs | Contraction Mapping Theorem

Metric Space Definition of metric space Definition. A metric space is an ordered pair $(M, d)$ where $M$ is a set and $d$ is a metric on $M$, i.e., a function $d: M\times M \to \mathbb{R}$ sa...

Oct 19, 2023 Mathematics, Math Miscs

Math Miscs | A Note on Stochastic Processes

This note partially uses the materials from the notes of MATH2750. Transition Matrix The transition kernel $\mathbf{M}$ is a square matrix of size $\vert S\vert \times \vert S\vert$. $\math...

Sep 3, 2023 Mathematics, Math Miscs

Economics & Game Theory | Zero-Determinant Strategy

This note aims to explain the parts omitted in this paper: Press, William H., and Freeman J. Dyson. “Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent.” Procee...

Aug 29, 2023 Interdisciplinarity, Economics & Game Theory

Economics & Game Theory | Classic Games

This note will be consistently updated. Prisoner’s Dilemma Two members of a criminal organization are arrested and imprisoned. Each prisoner is in solitary confinement with no means of commun...

Aug 13, 2023 Interdisciplinarity, Economics & Game Theory

Economics & Game Theory | A Memo on Game Theory

This note will be consistently updated. Rationality A rational player is one who chooses his action to maximize his payoff, consistent with his beliefs about what is going on in the game. ...

Aug 10, 2023 Interdisciplinarity, Economics & Game Theory

Multi-Agent Reinforcement Learning | Fictitious Self-Play and Zero-Shot Coordination

Fictitious Play Fictitious play is a learning rule. In it, each player presumes that the opponents are playing stationary (possibly mixed) strategies. At each round, each player thus best ...

Jul 31, 2023 Artificial Intelligence, Multi-Agent Reinforcement Learning

Reinforcement Learning | Policy Gradient Details

The only way to make sense out of change is to plunge into it, move with it, and join the dance. — Alan Watts. Bellman Equations [V(s_t) = \mathbb{E}\left[ r_t + \gamma\cdot V(s_{t+1}) \right...

Jul 24, 2023 Artificial Intelligence, Reinforcement Learning

Machine Learning Basics | RNNs

NLP Terms NLP = Natural Language Processing Embedding In a general sense, “embedding” refers to the process of representing one kind of object or data in another space or format. It involves m...

Jul 15, 2023 Artificial Intelligence, Machine Learning Basics

Multi-Agent Reinforcement Learning | MARL Basics

This note has not been finished yet. Markov Models MDP Markov decision process $(S, A, \mathcal{P}, R, \gamma)$ Single-agent, fully observable, and dynamic. P...

Jun 29, 2023 Artificial Intelligence, Multi-Agent Reinforcement Learning

Code Utils | Computation Graph Visualization

PyTorchviz Basics Install brew install graphviz (or here) pip install torchviz Documentation: Github Official examples: Colab If a node represents a backward fu...

Jun 24, 2023 Efficiency, Code Utils

Math Miscs | Dynamic Epistemic Logic

Three logicians walk into a bar. The bartender asks: “Do you all want a drink?” The first logician says: “I don’t know.” The second logician says: “I don’t know.” The third logician says: “Yes.” ...

Jun 22, 2023 Mathematics, Math Miscs

Multi-Agent Reinforcement Learning | Theory of Mind and Markov Models

We do not see things as they are, we see them as we are. — Anaïs Nin. What is Theory of Mind? In psychology, theory of mind refers to the capacity to understand other people by ascribing menta...

Jun 19, 2023 Artificial Intelligence, Multi-Agent Reinforcement Learning

Reinforcement Learning | RL Toolbox

This note will be consistently updated. PPO Tricks There are a total of 37 tricks, among which 13 are relatively core. PPO paper The 37 Implementation Details of Proximal Policy O...

Apr 10, 2023 Artificial Intelligence, Reinforcement Learning

Code Utils | Misc Code Toolbox

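This note will be consistently updated. Tmux It has been so long since I connected to a server that I have almost forgotten how to use this... No need to think about complicated operations; I use it for only two reasons. First, after I start a Python file on the server with it, the job keeps running in the background even after I disconnect from the server. Second, with a single ssh connection to the server I can use tmux to work with multiple shells, for example to run two Python files at the same time. This is probably the...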

Apr 9, 2023 Efficiency, Code Utils

Math Miscs | Math Toolbox

This note will be consistently updated. Optimization Basics The standard form for an optimization problem (the primal problem) is the following: [\begin{aligned} &\min\limits_{x} \quad f_...

Apr 7, 2023 Mathematics, Math Miscs

Robotics | Swinging Search and Crawling Control

Yue Lin, et al. A snake-inspired path planning algorithm based on reinforcement learning and self-motion for hyper-redundant manipulators. International Journal of Advanced Robotic Systems 2022.

Apr 3, 2023 Interdisciplinarity, Robotics

Robotics | RHex-T3: A Mobile Robot, with Hybrid Leg Design

Please be aware that the videos accompanying this article may take some time to load, depending on the speed of your internet connection to GitHub. Innovative design and simulation of a transform...