Web-Search-Field-Study-Toolkit

Make field study easier to conduct!


Introduction

This codebase contains the source code of the field study platform used in our WWW 2021 paper, "Towards a Better Understanding of Query Reformulation Behavior in Web Search".

Overview

List of Recorded Information

  • Pre-query expectation: such as diversity, result type, redundancy, difficulty, number of relevant results, effort.
  • Query reformulation: such as reformulation type, reformulation interface, reformulation reason, reformulation inspiration source, etc.
  • Query-level result usefulness: rated on a 4-point scale: 0 (useless), 1 (partially useful), 2 (very useful), 3 (serendipity).
  • Query-level and session-level user satisfaction: both rated on a 5-point scale.
  • Search behavior log: such as mouse movement, search queries, and timestamps.
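As a rough illustration of the rating scales listed above, the following sketch shows one way a single annotation could be represented and validated. It is not taken from the toolkit: the class and field names are hypothetical, and the 1 to 5 coding of satisfaction is an assumption, since the list only states that the scale has five points.

```python
from dataclasses import dataclass
from enum import IntEnum


class Usefulness(IntEnum):
    """The 4-point usefulness scale from the list above."""
    USELESS = 0
    PARTIALLY_USEFUL = 1
    VERY_USEFUL = 2
    SERENDIPITY = 3


@dataclass
class QueryAnnotation:
    # Hypothetical field names, chosen for illustration only.
    query: str
    usefulness: Usefulness
    satisfaction: int  # ASSUMPTION: coded 1..5; the README only says 5-point

    def __post_init__(self):
        # Usefulness(...) raises ValueError for values outside 0..3.
        self.usefulness = Usefulness(self.usefulness)
        if not 1 <= self.satisfaction <= 5:
            raise ValueError("satisfaction must be between 1 and 5")
```

Validating at construction time means an out-of-range rating is rejected before it reaches storage, which is one simple way to keep collected annotations consistent.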

You can add or remove features as needed. We are also delighted to introduce the dataset we collected with this toolkit: TianGong-Qref. 🤠

Environment compatibility

  • Python>=2.7
  • Django>=1.8.3

Support

For now, this toolkit only supports logging on Baidu and Sogou, the two largest commercial search engines in China. We welcome contributions that add support for other search engines such as Google, Bing, Yahoo, and Naver.

How to launch

  • Because the toolkit uses MongoDB to store data, first make sure that your Django backend is connected to a running MongoDB instance. On Linux or macOS, run the following commands to check that MongoDB has been launched correctly:
cd /usr/local/bin
sudo ./mongod

Then open another terminal window and run the following command:

cd /usr/local/bin
./mongo

If you have any problems with MongoDB, please refer to this tutorial.

  • To initialize the MongoDB database:
python manage.py makemigrations user_system
python manage.py makemigrations task_manager
python manage.py migrate
  • You can then launch the Django backend with the following command:
python manage.py runserver 0.0.0.0:8000
  • Install the Chrome extension in Google Chrome.

  • Visit the annotation platform (0.0.0.0:8000) and register a new account.

  • Click the extension logo and log in with that account.

  • Now everything is ready. Start your field study!
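If the migration or server commands above cannot reach the database, it may help to confirm that MongoDB is accepting connections at all. The sketch below is not part of the toolkit; it only attempts a TCP connection using the standard library. The function name is hypothetical, and the host and port are assumptions (27017 is MongoDB's default port; adjust them if your setup differs).

```python
import socket


def mongo_is_reachable(host="127.0.0.1", port=27017, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    ASSUMPTION: MongoDB listens on its default port 27017 on this machine.
    This only checks that something accepts connections on that port; it
    does not verify that the listener is actually MongoDB.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except socket.error:
        # Connection refused, timed out, or host unreachable.
        return False
    conn.close()
    return True
```

Calling `mongo_is_reachable()` from a Python shell before running `python manage.py migrate` gives a quick yes/no answer without needing the `mongo` client.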

Some things you should note

  • The baseUrl in the extension must be the same as the base URL of the annotation platform.
var baseUrl = "http://127.0.0.1:8000";
  • Make sure that the Chrome extension is turned on before searching, or nothing will be recorded.

  • Queries may not be recorded correctly if users submit them very frequently, e.g., two queries within 1 second. Please ask participants to search at a normal speed. We also welcome contributions that fix this bug.
  • Each recorded query should be annotated within 48 hours; otherwise it will be removed, since users may have forgotten the details of the search by then.
  • It is normal to occasionally see an error when submitting the annotations for a query. Just return to the previous page and submit again.

  • For Baidu, you should 1) turn off the instant prediction function, and 2) set all SERPs to open in a new window. Without these settings, search pages are updated only by in-page JavaScript functions, and the toolkit will fail to record the correct information.
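The 48-hour annotation window mentioned in the notes above can be illustrated with a small check. This is a sketch under stated assumptions, not the toolkit's actual cleanup logic, which is not shown here: the function and constant names are hypothetical, and timestamps are assumed to be naive `datetime` objects in the same time zone.

```python
from datetime import datetime, timedelta

# The 48-hour limit stated in the notes above.
ANNOTATION_WINDOW = timedelta(hours=48)


def is_expired(recorded_at, now=None):
    """Return True if a query recorded at `recorded_at` is past the window.

    Hypothetical helper: both arguments are assumed to be naive datetimes
    in the same time zone.
    """
    if now is None:
        now = datetime.now()
    return now - recorded_at > ANNOTATION_WINDOW
```

A check like this could be used, for example, to remind participants about queries that are close to expiring.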

Citation

If you find the resources in this repo useful, please star the repo and cite our work:

@inproceedings{chen2021towards,
  title={Towards a Better Understanding of Query Reformulation Behavior in Web Search},
  author={Chen, Jia and Mao, Jiaxin and Liu, Yiqun and Zhang, Fan and Zhang, Min and Ma, Shaoping},
  booktitle={Proceedings of the Web Conference 2021},
  pages={743--755},
  year={2021}
}

Contact

If you have any questions, please feel free to contact me via chenjia0831@gmail.com or open an issue.

Acknowledgement

This toolkit is built on the prototype systems that were used in several previous works: