Publications

Year of 2022

Emergent Abilities of Large Language Models
Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus
arXiv
PDF

PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery*, Sharan Narang*, Jacob Devlin*, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
arXiv & Google AI Blog
PDF | Blog Post

Unifying Language Learning Paradigms
Yi Tay*, Mostafa Dehghani*, Vinh Q. Tran, Xavier Garcia, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Neil Houlsby, Donald Metzler
arXiv
PDF | Tweet | Checkpoints

Transformer Memory as a Differentiable Search Index
Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler
arXiv
PDF | Tweet
Press: Yannic’s Youtube Paper Review | Yannic’s Author Interview | ZetaAlpha NeuralIR podcast

Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
ACM Computing Surveys 2022
PDF | Tweet

HyperPrompt: Prompt-based Task-Conditioning of Transformers
Yun He*, Huaixiu Steven Zheng*, Yi Tay, Jai Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi
ICML 2022
PDF

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
Vamsi Aribandi*, Yi Tay*, Tal Schuster, Jinfeng Rao, Huaixiu Steven Zheng, Sanket Vaibhav Mehta, Honglei Zhuang, Vinh Q. Tran, Dara Bahri, Jianmo Ni, Jai Gupta, Kai Hui, Sebastian Ruder, Donald Metzler
Proceedings of ICLR 2022
PDF | Tweet | Press: Yannic’s Youtube Channel

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay*, Mostafa Dehghani*, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler
Proceedings of ICLR 2022
PDF | Checkpoints

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
Yi Tay*, Vinh Q. Tran*, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler
Proceedings of ICLR 2022
PDF | Tweet | Code | Perspective API launch | Press: AI Coffee Break (Youtube)

The Efficiency Misnomer
Mostafa Dehghani*, Anurag Arnab*, Lucas Beyer*, Ashish Vaswani, Yi Tay*
Proceedings of ICLR 2022
PDF

SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri, Heinrich Jiang, Yi Tay, Donald Metzler
Proceedings of ICLR 2022
PDF

Sharpness-Aware Minimization Improves Language Model Generalization
Dara Bahri, Hossein Mobahi, Yi Tay
Proceedings of ACL 2022
PDF

Improving Compositional Generalization with Self-Training for Data-to-Text Generation
Sanket Vaibhav Mehta, Jinfeng Rao, Yi Tay, Mihir Kale, Ankur Parikh, Hongtao Zhong, Emma Strubell
Proceedings of ACL 2022
PDF | Sanket’s Internship project

ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Gupta, Cicero Nogueira dos Santos, Yi Tay, Donald Metzler
Proceedings of ACL 2022 (Findings)
PDF

A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Lees*, Vinh Q. Tran*, Yi Tay*, Jeffrey Sorensen, Jai Gupta, Donald Metzler, Lucy Vasserman
KDD 2022 (Applied Data Science Track)
PDF | Jigsaw’s Medium Blog

Born Again Neural Rankers
Zhen Qin, Le Yan, Yi Tay, Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Marc Najork
arXiv preprint
PDF

Year of 2021

OmniNet: Omnidirectional Representations from Transformers
Yi Tay*, Mostafa Dehghani*, Vamsi Aribandi, Jai Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler
Proceedings of ICML 2021 (Oral, Long Talk) ~top 3%
PDF

Synthesizer: Rethinking Self-Attention in Transformer Models
Yi Tay, Dara Bahri, Donald Metzler, Da-Cheng Juan, Zhe Zhao, Che Zheng
Proceedings of ICML 2021
PDF | Tweet
Model code is released in the Mesh Tensorflow Library
Press: Yannic Kilcher’s Channel | DL Reviews | Stanford CS224N

Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay*, Mostafa Dehghani*, Samira Abnar, Yikang Shen, Dara Bahri, Philip Pham, Jinfeng Rao, Liu Yang, Sebastian Ruder, Donald Metzler
Proceedings of ICLR 2021
PDF | Tweet
Code is released at Google Research Github
Press: Huggingface Blogs | SyncedReview

HyperGrid - Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan
Proceedings of ICLR 2021
PDF

Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees?
Zhen Qin, Le Yan, Honglei Zhuang, Yi Tay, Rama Kumar Pasumarthi, Xuanhui Wang, Michael Bendersky, Marc Najork
Proceedings of ICLR 2021 (spotlight)
PDF

Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with 1/N Parameters
Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Luu Anh Tuan, Siu Cheung Hui, Jie Fu
Proceedings of ICLR 2021 (spotlight)
PDF | (Outstanding Paper Award, Top-8)
Pre-Google Work

Are Pretrained Convolutions Better than Pretrained Transformers?
Yi Tay, Mostafa Dehghani, Jai Gupta, Dara Bahri, Vamsi Aribandi, Zhen Qin, Donald Metzler
Proceedings of ACL 2021 (Long Paper)
PDF | Tweet | Press: AI Coffee Break (Youtube)

Structformer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling
Yikang Shen, Yi Tay, Che Zheng, Dara Bahri, Donald Metzler, Aaron Courville
Proceedings of ACL 2021 (Long Paper)
PDF | Code at Google Research Github
Yikang’s Internship Work at Google Research

Are Model Diagnostics Reliable?
Vamsi Aribandi, Yi Tay, Donald Metzler
Proceedings of ACL 2021 (Short Paper, Findings of ACL)
PDF
Vamsi’s work as Google AI resident

On Orthogonality Constraints of Transformers
Aston Zhang, Alvin Chan, Yi Tay, Jie Fu, Shuohang Wang, Shuai Zhang, Huajie Shao, Shuochao Yao, Roy Ka-Wei Lee
Proceedings of ACL 2021 (Short Paper)
PDF
Pre-Google Work

Do Transformer Modifications Transfer Across Implementations and Applications?
Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel
Proceedings of EMNLP 2021
PDF | Press: Synced Review

Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study
Dara Bahri, Yi Tay, Che Zheng, Donald Metzler, Cliff Brunk, Andrew Tomkins
Proceedings of WSDM 2021
PDF | (Best Paper Award Runner-Up)

Rethinking Search: Making Domain Experts out of Dilettantes
Donald Metzler, Yi Tay, Dara Bahri, Marc Najork
ACM SIGIR forum
PDF
Press: Wired | MIT Technology Review

Knowledge Router: Learning Disentangled Representations for Knowledge Graphs
Shuai Zhang, Xi Rao, Yi Tay, Ce Zhang
Proceedings of NAACL 2021
PDF

Self-Instantiated Recurrent Units with Dynamic Soft Recursion
Aston Zhang, Yi Tay, Yikang Shen, Alvin Chan, Shuai Zhang
Proceedings of NeurIPS 2021
PDF
Pre-Google Work

Year of 2020

Sparse Sinkhorn Attention
Yi Tay, Dara Bahri, Liu Yang, Donald Metzler, Da-Cheng Juan
Proceedings of ICML 2020
PDF

Reverse Engineering Configurations of Neural Text Generation Models
Yi Tay, Dara Bahri, Che Zheng, Clifford Brunk, Donald Metzler, Andrew Tomkins
Proceedings of ACL 2020 (Short Paper)
PDF

Would you Rather? A New Benchmark for Learning Machine Alignment with Cultural Values and Social Preferences
Yi Tay, Donovan Ong, Jie Fu, Alvin Chan, Nancy Chen, Anh Tuan Luu and Christopher Pal
Proceedings of ACL 2020 (Short Paper)
PDF

Interactive Machine Comprehension via Information Seeking Agents
Xingdi Yuan, Jie Fu, Marc-Alexandre Cote, Yi Tay, Christopher Pal, Adam Trischler
Proceedings of ACL 2020 (Long Paper)
PDF | Code

Jacobian Adversarially Regularized Networks for Robustness
Alvin Chan, Yi Tay, Yew Soon Ong, Jie Fu
Proceedings of ICLR 2020
PDF

What it Thinks is Important is Important: Robustness Transfers Through Input Gradients
Alvin Chan, Yi Tay, Yew Soon Ong
Proceedings of CVPR 2020
PDF

HyperML: A Boosting Metric Learning Approach in Hyperbolic Space for Recommender Systems
Lucas Vinh Tran, Yi Tay, Shuai Zhang, Gao Cong, Xiaoli Li
Proceedings of WSDM 2020 (Best Paper Award Runner-Up)
PDF

Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
Alvin Chan, Yi Tay, Yew Soon Ong, Aston Zhang
Proceedings of EMNLP 2020 (Findings)
PDF

Choppy: Cut Transformers for Ranked List Truncation
Dara Bahri, Yi Tay, Che Zheng, Donald Metzler, Andrew Tomkins
Proceedings of SIGIR 2020 (Short Paper)
PDF

Multi-level Head-wise Match and Aggregation in Transformer for Textual Sequence Matching
Shuohang Wang, Yunshi Lan, Yi Tay, Jing Jiang, Jingjing Liu
Proceedings of AAAI 2020
PDF

Year of 2019

Compositional De-Attention Networks
Yi Tay, Luu Anh Tuan, Aston Zhang, Shuohang Wang, Siu Cheung Hui
Proceedings of NeurIPS 2019
PDF

Quaternion Knowledge Graph Embedding
Shuai Zhang, Yi Tay, Lina Yao, Qi Liu
Proceedings of NeurIPS 2019
PDF

Lightweight and Efficient Neural Natural Language Processing with Quaternion Networks
Yi Tay, Aston Zhang, Anh Tuan Luu, Jinfeng Rao, Shuai Zhang, Shuohang Wang, Jie Fu, Siu Cheung Hui
Proceedings of ACL 2019 (Long Paper)
PDF | Code
Featured in ICML 2019 tutorial

Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
Yi Tay, Shuohang Wang, Anh Tuan Luu, Jie Fu, Minh C. Phan, Xingdi Yuan, Jinfeng Rao, Siu Cheung Hui, Aston Zhang
Proceedings of ACL 2019 (Long Paper, Oral Presentation)
PDF

Robust Representation Learning of Biomedical Names
Minh C. Phan, Aixin Sun, Yi Tay
Proceedings of ACL 2019 (Long Paper)
PDF

Confusionset-guided Pointer Networks for Chinese Spelling Check
Dingmin Wang, Yi Tay, Li Zhong
Proceedings of ACL 2019 (Short Paper, Oral Presentation)
PDF

Bridging the Gap of Relevance Matching and Semantic Matching with Hierarchical Co-Attention Network
Jinfeng Rao, Linqing Liu, Yi Tay, Wei Yang, Peng Shi, Jimmy Lin
Proceedings of EMNLP 2019 (Long Paper)
PDF | Code

Holographic Factorization Machines for Recommendation
Yi Tay, Shuai Zhang, Anh Tuan Luu, Siu Cheung Hui, Lina Yao, Lucas Vinh Tran
Proceedings of AAAI 2019
PDF

Quaternion Collaborative Filtering for Recommendation
Shuai Zhang, Lina Yao, Lucas Vinh Tran, Aston Zhang, Yi Tay
Proceedings of IJCAI 2019
PDF

Interact and Decide: Medley of Sub-Attention Networks for Group Recommendation
Lucas Vinh Tran, Tuan-Anh Nguyen Pham, Yi Tay, Yiding Liu, Gao Cong, Xiaoli Li
Proceedings of SIGIR 2019 (Full Paper)
PDF

Deep Learning based Recommender System - A Survey and New Perspectives
Shuai Zhang, Lina Yao, Aixin Sun, Yi Tay
Proceedings of ACM Computing Surveys (2019)
PDF | Code

Year of 2018

Recurrently Controlled Recurrent Networks
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of NeurIPS 2018
PDF | Code

Densely Connected Attention Propagation for Reading Comprehension
Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su
Proceedings of NeurIPS 2018
PDF | Code

Reasoning with Sarcasm by Reading In-Between
Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su
Proceedings of ACL 2018 (Long Paper, Oral)
PDF

Compare, Compress and Propagate: Enhancing Neural Architectures with Alignment Factorization for Natural Language Inference
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of EMNLP 2018 (Long Paper)
PDF | Code

Co-stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of EMNLP 2018 (Long Paper)
PDF | Code

Multi-Granular Sequence Encoding via Dilated Composition Units for Reading Comprehension
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of EMNLP 2018 (Long Paper)
PDF | Code

Attentive Gated Lexicon Reader via Contrastive Contextual Co-Attention for Sentiment Classification
Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Jian Su
Proceedings of EMNLP 2018 (Long Paper)
PDF

Multi-Pointer Co-Attention Networks for Recommendation
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of KDD 2018 (Oral)
PDF | Code

Multi-Cast Attention Networks
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of KDD 2018
PDF

Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of WWW 2018
PDF | Code

Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of WSDM 2018
PDF | Code

CoupleNet: Paying Attention to Couples with Coupled Attention for Relationship Recommendation
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of ICWSM 2018
PDF | Dataset

Cross Temporal Recurrent Networks for Ranking Question Answer Pairs
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of AAAI 2018 (Oral)
PDF | Dataset

Learning to Attend via Word-Aspect Associative Fusion for Aspect-based Sentiment Analysis
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of AAAI 2018
PDF | Dataset

SkipFlow: Incorporating Neural Coherence Features for End-to-End Automatic Text Scoring
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of AAAI 2018
PDF

Hermitian Co-Attention Networks for Text Matching in Asymmetrical Domains
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of IJCAI 2018
PDF

Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All
Minh C. Phan, Aixin Sun, Yi Tay, Jialong Han, Chenliang Li
Proceedings of IEEE TKDE 2018
PDF

2017 and Before

Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture
Yi Tay, Minh C. Phan, Anh Tuan Luu, Siu Cheung Hui
Proceedings of SIGIR 2017
PDF | Dataset

Random Semantic Tensor Ensemble for Scalable Link Prediction on Knowledge Graphs
Yi Tay, Anh Tuan Luu, Siu Cheung Hui, Falk Brauer
Proceedings of WSDM 2017
PDF | Dataset

Non-parametric Estimation of Multiple Embeddings for Link Prediction on Dynamic Knowledge Graphs
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of AAAI 2017
PDF

Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs
Yi Tay, Anh Tuan Luu, Minh C. Phan, Siu Cheung Hui
Proceedings of CIKM 2017
PDF

Dyadic Memory Networks for Aspect-based Sentiment Analysis
Yi Tay, Anh Tuan Luu, Siu Cheung Hui
Proceedings of CIKM 2017
PDF

NeuPL: Attention-based Semantic Matching and Pair Linking for Entity Disambiguation
Minh C. Phan, Aixin Sun, Yi Tay, Jialong Han, Chenliang Li
Proceedings of CIKM 2017
PDF

Cross Device User Linking: URL, Session, Visiting Time and Device Log Embedding
Minh C. Phan, Aixin Sun, Yi Tay
Proceedings of SIGIR 2017 (Short Paper)
PDF | Code

Learning Term Embeddings for Taxonomic Relation Identification with Dynamic Weighting Neural Network
Anh Tuan Luu, Yi Tay, Siu Cheung Hui, See Kiong Ng
Proceedings of EMNLP 2016 (Long Paper)
PDF

Workshops and Demos

Detecting Waterborne Debris with Sim2Real and Randomization
Jie Fu, Ritchie Ng, Mirgahney Mohamed, Yi Tay, Kris Sankaran, Shangbang Long, Alfredo Canziani, Chris Pal, Moustapha Cisse
Proceedings of ICML 2019 - AI4SocialGood Workshop
PDF

Next Item Recommendation with Self-Attentive Metric Learning
Shuai Zhang, Yi Tay, Lina Yao, Aixin Sun, Jake An
Proceedings of AAAI 2019 - RecNLP Workshop

DeepRec: An Open-source Toolkit for Deep Learning based Recommendation
Shuai Zhang, Yi Tay, Lina Yao, Bin Wu, Aixin Sun
Proceedings of IJCAI 2019 - Demo Track

Book Chapters

Recommender Systems
Shuai Zhang, Aston Zhang, Yi Tay
Book Title: Dive into Deep Learning
Book Authors: Aston Zhang, Zack C. Lipton, Mu Li, Alex J. Smola

Deep Neural Networks Based Recommender Systems
Shuai Zhang, Yi Tay, Lina Yao, Aixin Sun, Ce Zhang
The 3rd Edition of the Recommender Systems Handbook (Springer)
Book Authors: Francesco Ricci, Lior Rokach, Bracha Shapira, Paul B. Kantor.

Thesis and Dissertation

Neural Architectures for Natural Language Understanding
Yi Tay
PhD Thesis, Nanyang Technological University
PDF