I am a systems engineer with interests across IoT, Big Data, and Machine Learning.

I received my Ph.D. in computer science from the Institute of Software, Chinese Academy of Sciences, in 2012, and my B.E. from the Department of Computer Science and Technology at Tsinghua University in 2006. I was a visiting scholar at Michigan State University in 2009.

I like photography and reading books. To find some of my photos, please visit my 500px.

Feel free to reach me at liqul (at) outlook.com, though I may not be very responsive.

Work Experience

  • 2016.11 - now Architect at K2Data.
  • 2016.5 - 2016.11 Engineer at Amap (高德地图).
  • 2014.11 - 2016.2 Engineer at PPzuche.com.
  • 2012.8 - 2014.11 Researcher at Microsoft Research Asia (MSRA).

Selected Projects

Data Management for Industrial IoT

Time series are first-class citizens in industrial scenarios. Machines with hundreds of sensors generate huge volumes of time series data that need to be stored and analyzed, which demands a scalable and reliable storage service. However, existing solutions (e.g., OpenTSDB, InfluxDB, and TimescaleDB) either adopt a single-node deployment or lack an appropriate data model. Therefore, we developed a new time series storage service that leverages existing open-source projects such as Hadoop, Kafka, ZooKeeper, and Parquet. We built the key building blocks to enable atomic data ingestion, partitioning, and compaction. Time series data can be directly read, processed, and analyzed by parallel computing frameworks such as MapReduce or Spark.
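As a rough illustration of the partitioning step, the sketch below groups incoming points by series and UTC day, so that each partition could later be written as one columnar file. It is only a minimal sketch under assumptions: the `series=.../date=...` layout, the function names, and the `(series_id, ts_ms, value)` tuple shape are hypothetical and not taken from the actual service.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(series_id: str, ts_ms: int) -> str:
    """Map a point to a partition path by series and UTC day (hypothetical layout)."""
    day = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"series={series_id}/date={day}"


def partition_points(points):
    """Group (series_id, ts_ms, value) tuples into time-sorted partitions."""
    parts = defaultdict(list)
    for series_id, ts_ms, value in points:
        parts[partition_key(series_id, ts_ms)].append((ts_ms, value))
    # Sort each partition by timestamp so range scans read rows in order.
    for rows in parts.values():
        rows.sort()
    return dict(parts)
```

Partitioning by time keeps range queries confined to a few partitions, which is one common reason parallel frameworks can scan such data efficiently.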

Machines generate not only time series but also a huge number of objects (files). Storing these files sounds trivial, but analyzing them is challenging, especially when there are a huge number of them. Our goal is to build an object storage service that can hold a huge number of files and offers easy interfaces for data analysis. For this purpose, we adopt the data model object = metadata + file. The metadata is indexed by Elasticsearch, and we implement atomic CRUD operations. We keep the files in HDFS, where MapReduce or Spark programs can read them directly. To avoid the “small files problem”, a background housekeeper process continuously compacts small files.
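The compaction idea can be sketched as a planning step that decides which small files to merge together. This is an assumed, simplified policy (greedy batching by size), not the housekeeper's actual algorithm; `plan_compaction`, `small_threshold`, and `target_size` are hypothetical names.

```python
def plan_compaction(files, small_threshold, target_size):
    """Plan merges of small files.

    files: list of (path, size_bytes) tuples.
    Returns a list of batches; each batch is a list of paths to merge
    into one larger file of at most roughly target_size bytes.
    """
    # Only files below the threshold are candidates; smallest first.
    small = sorted((f for f in files if f[1] < small_threshold), key=lambda f: f[1])
    batches, current, current_size = [], [], 0
    for path, size in small:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    # Merging a single file with nothing else gains nothing, so drop such batches.
    return [batch for batch in batches if len(batch) > 1]
```

In a real deployment the merge itself would also have to update the metadata index atomically, which this sketch does not cover.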

Transportation Activity Recognition on Smartphones

In data crowdsourcing, it is useful to know the user’s transportation status. For instance, for a map maintainer, knowing the transportation status helps to distinguish whether the data was collected on a road, on a sidewalk, or inside a building. However, achieving high accuracy is nontrivial. Practical challenges include unknown phone posture, random noise, and the need for high energy efficiency. I designed a framework that combines inputs from the inertial sensors (accelerometer and gyroscope) with various kinds of contextual information. Extraneous events caused by random user activities are filtered out with intuitive rules discovered from our training data, which was collected from tens of people. In the end, we achieved significantly better accuracy than existing frameworks from Google and Samsung.

Indoor Localization

Localization in indoor areas is nontrivial because GPS signals cannot penetrate walls. To tackle this challenge, we leverage various signals that are commonly found in indoor environments, e.g., Wi-Fi, Bluetooth, the geomagnetic field, IMU sensors, and even visible light.


Specifically, we first proposed a Wi-Fi based positioning system called Modellet. Modellet takes advantage of both fingerprint-based and model-based approaches, and is able to adapt to environmental locality and training data density. We evaluated Modellet with data collected from venues (office, airport, and shopping mall) across China, Germany, and the U.S., and showed that it outperforms Radar and EZPerfect. Secondly, we built a system called Magicol, which adds more modalities, i.e., the geomagnetic field and IMU sensors, to improve localization accuracy. We use the particle filter framework to fuse the signals. Evaluation results show that Magicol achieves around 2-meter accuracy in office and shopping mall areas. Finally, we explored the possibility of positioning with visible light emitted from LEDs. LED bulbs are modified to blink in unique patterns (invisible to human eyes). Any device with a light sensor can decode the patterns and estimate its own position based on the light energy propagation model. This system, called Epsilon, achieves submeter-level accuracy.
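To give a feel for how a particle filter fuses motion and signal measurements, here is a minimal one-dimensional sketch. It is a generic textbook-style filter, not Magicol's implementation: the function name, the Gaussian noise parameters, and the `expected_signal` map (position to expected signal value) are all assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, step, measurement, expected_signal,
                         motion_std=0.3, meas_std=1.0, rng=random):
    """One predict/weight/resample cycle over 1-D position hypotheses.

    particles: list of candidate positions (meters).
    step: displacement reported by the motion sensors since the last cycle.
    measurement: observed signal value at the true (unknown) position.
    expected_signal: hypothetical map from position to expected signal value.
    """
    # 1. Predict: move every particle by the reported step plus motion noise.
    moved = [p + step + rng.gauss(0.0, motion_std) for p in particles]
    # 2. Weight: particles whose expected signal matches the measurement score higher.
    weights = [
        math.exp(-0.5 * ((measurement - expected_signal(p)) / meas_std) ** 2)
        for p in moved
    ]
    if sum(weights) == 0.0:
        # No particle explains the measurement; fall back to uniform weights.
        weights = [1.0] * len(moved)
    # 3. Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    estimate = sum(resampled) / len(resampled)
    return resampled, estimate
```

A typical usage would initialize particles uniformly over the venue and call `particle_filter_step` once per detected step, so the estimate sharpens as more measurements arrive.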

Wireless Sensor Network


A wireless sensor network (WSN) typically consists of a large number of networked embedded devices, called sensor nodes. In WSNs, data is transmitted from one node to another in a multi-hop manner. WSNs are usually deployed in harsh environments, such as forests or around volcanoes, and therefore the nodes fail frequently. I studied several issues arising in WSNs. Specifically, I developed voice-streaming systems (namely QVS and ASM) that are aware of voice quality. These systems prevent network congestion with an admission control protocol. I also investigated the time synchronization problem, where accurate relative time must be maintained between sensor nodes. I exploited the regular pattern of the RDS data carried by FM radio signals to achieve energy-efficient, millisecond-level time synchronization in city-scale sensor networks.
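One common way to maintain relative time between two nodes is to have both timestamp the same broadcast reference events and then fit a linear mapping between their clocks. The sketch below shows that generic idea with ordinary least squares; it is not the RDS-based protocol itself, and the function names and the assumption that both nodes observe identical events are illustrative only.

```python
def fit_clock_mapping(ts_a, ts_b):
    """Fit ts_b ~= skew * ts_a + offset from paired local timestamps.

    ts_a, ts_b: local clock readings (e.g., in milliseconds) that nodes A and B
    recorded for the same sequence of broadcast reference events.
    """
    n = len(ts_a)
    if n < 2 or n != len(ts_b):
        raise ValueError("need at least two paired timestamps")
    mean_a = sum(ts_a) / n
    mean_b = sum(ts_b) / n
    var_a = sum((a - mean_a) ** 2 for a in ts_a)
    if var_a == 0:
        raise ValueError("timestamps from node A must not all be equal")
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ts_a, ts_b))
    skew = cov / var_a                # relative clock rate of B with respect to A
    offset = mean_b - skew * mean_a   # B's reading when A's clock reads zero
    return skew, offset


def to_clock_b(t_a, skew, offset):
    """Convert a timestamp on node A's clock to node B's clock."""
    return skew * t_a + offset
```

Because a periodic broadcast supplies the reference events for free, nodes only need to listen briefly, which is where the energy saving in such schemes usually comes from.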

Publications

Indoor Localization

Wireless Sensor Networking

Misc

Awards

  • 2nd place in IPSN Indoor Localization Competition in 2014 held in Berlin, Germany.
  • Excellent Ph.D. Thesis Award from the Chinese Academy of Sciences in 2013.