Sankalok Sen

About

Hello, my name is Sankalok (pronounced shaun-ko-lock). I like thinking deeply about complicated problems, writing down and discussing possible solutions with colleagues, and tinkering with computers/servers while using up all of the available resources. I also try to publish once in a while.

Experience

Since early 2024, I have been a Research Engineer in the System Theory Group @ Huawei’s Theory Lab in Hong Kong. Currently, I mostly spend my time multiplying matrices to build fancy applications (LLMs, VLMs, OCR models, all of the fancy-schmancy stuff for B2B products) in Python. Last year, I sorted and manipulated numbers to build fancy applications (In-memory and On-disk Vector Search for Cloud and Terminal Storage) in C++. I also like making them run fast. In fact, our team managed to make our vector search run so fast that it’s now among the top 3 algorithms in the ANN benchmarks! (check out kgn :D)

Previously, I interned and then worked full-time (my first job) as a Junior Research Scientist at B. Y. Quantitative Medicine (a cancer research start-up). During my internships, I tried making sense of different types of cross-platform High-Grade Serous Ovarian Cancer datasets. During my stint there as a full-timer, I tried to model these datasets to make them make sense (for chemo-resistivity prediction).

Among my not-so-recent internships, I was an AI and Finance Research Intern at AskLora (great people, great culture, learnt a lot of practical data science that I still use to this day), where I tried to process financial data into fancy schematic forms for database storage (like ScyllaDB). I was also a MITACS Globalink Research Intern at Saint Mary’s University in Nova Scotia, where I got my initial exposure to Natural Language Processing principles when I attempted to dissect Corporate Social Responsibility Reports for the Fortune 500.

A long, long time ago, I was a Teaching Assistant at my alma mater (The University of Hong Kong, where I somehow graduated with a Bachelor’s in Computer Science while picking up a minor in Statistics), where I taught Python to freshmen. For my undergraduate thesis, I worked on building some nice (looking back, pretty okayish rather) computational linguistics algorithms to help understand Hong Kong Legal Judgments.

Publications

  1. Yazheng Yang, Yuqi Wang, Yaxuan Li, Sankalok Sen, Lei Li, Lin Qiu and Qi Liu, Unlock the Potential of Large Language Models for Predictive Tabular Tasks in Data Science with Table-Specific Pretraining, In Review: IEEE Transactions on Knowledge and Data Engineering (TKDE), 2025.
  2. Srinjoy Bhuiya, Ayushman Kumar, and Sankalok Sen, Exploring the Effects of Data Augmentation for Drivable Area Segmentation, In SCRS Proceedings of International Conference of Undergraduate Students, SCRS (ICUS), India, 2023.
  3. Sankalok Sen, Analyzing Hong Kong’s Legal Judgments from a Computational Linguistics point-of-view, Undergraduate Thesis, 2023.

Disclaimer: Opinions are my own and do not reflect those of my employer.