Shangbin photo

Shangbin Feng

Greetings! I am a Ph.D. student at the University of Washington, advised by Yulia Tsvetkov. I work on natural language processing, knowledge bases, and social network analysis.

Links:   CV   Email   Twitter   Github   Google Scholar   Semantic Scholar


Publications

2023

CooK Teaser

CooK: Empowering General-Purpose Language Models with Modular and Collaborative Knowledge

Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, Yulia Tsvetkov

arxiv 2023   paper  

We propose CooK, a community-driven initiative to empower black-box LLMs with modular and collaborative knowledge. By incorporating the outputs of independently trained, small, and specialized LMs, we make LLMs better knowledge models by empowering them with temporal knowledge update, multi-domain knowledge synthesis, and continued improvement through collective efforts.

NLGraph Teaser

Can Language Models Solve Graph Problems in Natural Language?

Heng Wang=, Shangbin Feng=, Tianxing He, Zhaoxuan Tan, Xiaochuang Han, Yulia Tsvetkov

arxiv 2023   paper   code  

Are language models graph reasoners? We propose the NLGraph benchmark, a test bed for graph-based reasoning designed for language models in natural language. We find that LLMs show preliminary graph reasoning abilities, while the most advanced graph reasoning tasks remain an open research question.

PoliLean Teaser

From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models

Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov

ACL 2023   🏆 Best Paper Award   paper   code   Washington Post   MIT Tech Review  

We propose to study the political bias propagation pipeline from pretraining data to language models to downstream tasks. We find that language models do have political biases, such biases are in part picked up from pretraining corpora, and they could result in fairness issues in LM-based solutions to downstream tasks.

FactKB Teaser

FactKB: Generalizable Factuality Evaluation using Language Models Enhanced with Factual Knowledge

Shangbin Feng, Vidhisha Balachandran, Yuyang Bai, Yulia Tsvetkov

arxiv 2023   paper   demo   code  

We propose a simple, easy-to-use, shenanigan-free summarization factuality evaluation model by augmenting language models with factual knowledge from knowledge bases.

KALM Teaser

KALM: Knowledge-Aware Integration of Local, Document, and Global Contexts for Long Document Understanding

Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov

ACL 2023   paper   code  

We propose KALM, a Knowledge-Aware Language Model that jointly incorporates external knowledge in three levels of document contexts: local, document-level and global.

BIC Teaser

BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency

Zhenyu Lei=, Herun Wan=, Wenqian Zhang, Shangbin Feng, Zilong Chen, Qinghua Zheng, Minnan Luo

ACL 2023   paper   code  

We propose to leverage text-graph interaction for Twitter bot detection while modeling semantic consistency in user timelines.

BotMoE Teaser

BotMoE: Twitter Bot Detection with Community-Aware Mixtures of Modal-Specific Experts

Yuhan Liu, Zhaoxuan Tan, Heng Wang, Shangbin Feng, Qinghua Zheng, Minnan Luo

SIGIR 2023   paper   code  

We propose community-aware mixture-of-experts to address two challenges in detecting advanced Twitter bots: manipulated features and diverse communities.

BotPercent Teaser

BotPercent: Estimating Twitter Bot Populations from Groups to Crowds

Zhaoxuan Tan=, Shangbin Feng=, Melanie Sclar, Herun Wan, Minnan Luo, Yejin Choi, Yulia Tsvetkov

arxiv 2023   paper   code  

We make the case for community-level bot detection, proposing the system BotPercent to estimate the bot populations from groups to crowds. Armed with BotPercent, we investigate the overall bot percentage among active users, bot presence in the Trump reinstatement vote, and more, yielding numerous interesting findings with implications for social media moderation.

KRACL Teaser

KRACL: Contrastive Learning with Graph Context Modeling for Sparse Knowledge Graph Completion

Zhaoxuan Tan, Zilong Chen, Shangbin Feng, Qingyue Zhang, Qinghua Zheng, Jundong Li, Minnan Luo

The Web Conference 2023   paper   code  

We propose to leverage contrastive learning in knowledge graph completion to alleviate the sparsity challenge in existing KGs.

2022

PAR Teaser

PAR: Political Actor Representation Learning with Social Context and Expert Knowledge

Shangbin Feng, Zhaoxuan Tan, Zilong Chen, Ningnan Wang, Peisheng Yu, Qinghua Zheng, Minnan Luo

EMNLP 2022   paper   code   poster  

We propose to learn representations of political actors with social context and expert knowledge, while applying learned representations to tasks in computational political science.

TwiBot-22 Teaser

TwiBot-22: Towards Graph-Based Twitter Bot Detection

Shangbin Feng=, Zhaoxuan Tan=, Herun Wan=, Ningnan Wang=, Zilong Chen=, Binchi Zhang=, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, Yuhan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo (= indicates equal contribution)

NeurIPS 2022, Datasets and Benchmarks Track   website   paper   code   poster  

We make the case for graph-based Twitter bot detection and propose a graph-based benchmark TwiBot-22, which addresses the issues of limited dataset scale, incomplete graph structure, and low annotation quality in previous datasets.

GraTo Teaser

GraTo: Graph Neural Network Framework Tackling Over-smoothing with Neural Architecture Search

Xinshun Feng, Herun Wan, Shangbin Feng, Hongrui Wang, Qinghua Zheng, Jun Zhou, Minnan Luo

CIKM 2022   paper   code  

We present GraTo, an NAS-GNN framework to tackle over-smoothing in graph mining.

datavoidant Teaser

Datavoidant: An AI System for Addressing Political Data Voids on Social Media

Claudia Flores-Saviaga, Shangbin Feng, Saiph Savage

CSCW 2022   paper   code  

We present Datavoidant, an AI-powered system to combat political data voids within underrepresented communities.

KCD Teaser

KCD: Knowledge Walks and Textual Cues Enhanced Political Perspective Detection in News Media

Wenqian Zhang=, Shangbin Feng=, Zilong Chen=, Zhenyu Lei, Jundong Li, Minnan Luo (= indicates equal contribution)

NAACL 2022, oral presentation   paper   code   invited talk slides  

We introduce the mechanism of knowledge walks to enable multi-hop reasoning on knowledge graphs and leverage textual labels in graphs for political perspective detection.

HeteroBot Teaser

Heterogeneity-aware Twitter Bot Detection with Relational Graph Transformers

Shangbin Feng, Zhaoxuan Tan, Rui Li, Minnan Luo

AAAI 2022   paper   code   poster

We introduce relational graph transformers to model relation and influence heterogeneities on Twitter for heterogeneity-aware Twitter bot detection.

2021

PPD Teaser

KGAP: Knowledge Graph Augmented Political Perspective Detection in News Media

Shangbin Feng=, Zilong Chen=, Wenqian Zhang=, Qingyao Li, Qinghua Zheng, Xiaojun Chang, Minnan Luo (= indicates equal contribution)

arxiv 2021   paper   code  

We construct a political knowledge graph and propose a graph-based approach for knowledge-aware political perspective detection.

BotRGCN Teaser

BotRGCN: Twitter Bot Detection with Relational Graph Convolutional Networks

Shangbin Feng, Herun Wan, Ningnan Wang, Minnan Luo

ASONAM 2021 Short   paper   code  

We propose a graph-based approach for Twitter bot detection with relational graph convolutional networks and four aspects of user information.

TwiBot-20 Teaser

TwiBot-20: A Comprehensive Twitter Bot Detection Benchmark

Shangbin Feng, Herun Wan, Ningnan Wang, Jundong Li, Minnan Luo

CIKM 2021, Resource Track   paper   code   poster

We propose the first comprehensive Twitter bot detection benchmark, which covers diversified users and supports graph-based approaches.

SATAR Teaser

SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detection

Shangbin Feng, Herun Wan, Ningnan Wang, Jundong Li, Minnan Luo

CIKM 2021, Applied Track   paper   code   poster

We propose to pre-train Twitter user representations with follower count and fine-tune on Twitter bot detection.


Miscellaneous