Co-Lab was originally created, as its name implies, to conduct research on collaborative systems and hypermedia. As our department has shifted its research direction toward Information Integration and Informatics (III, i.e., data-intensive research) and Human-Centered Computing (HCC), the lab is now used primarily for data-intensive research.
All IS and CS faculty and their Ph.D. students are welcome to become members of the Co-Lab. Membership is managed on a per-person basis, however. The lab focuses on data-intensive research and on web-based knowledge bases and systems that include hypermedia (web linking and navigation), including digital libraries, so projects with these foci are given first priority.
The purposes of the Lab are:
- To be the “work space” on campus for IS PhD students, full time as well as part time (along with our other labs).
- To support faculty and PhD student research in the data-intensive research areas.
- To provide a collaborative environment and computer support for visiting scholars and visiting researchers to the IS Department.
There are several different parts of the Lab:
- The development area is used by faculty and students who are building systems or working on specific research projects involving system design and evaluation, or data collection and analysis related to how well the functionality and usability of information systems meet user needs. Two carrels at the back of this area are designated for priority use by funded projects that collect sensitive data or operate a system that requires security rather than casual access by passers-by.
- Storage areas: two large closets and many cabinets. Full time PhD students who are members of the lab are assigned a locked cabinet for secure storage of their personal materials. They also have individual alarm codes for access after hours.
Day to Day Management:
The lab directors set policies and priorities. One or two advanced PhD students with experience in the Lab are appointed lab manager (or co-managers; sometimes hardware management and people management are divided between them). The lab manager supervises the scheduling of lab monitors from among the PhD students and handles hardware and software problems and updates, with support from Computing Services. Each supported PhD student who is a member of the Lab, whether supported on department or grant funds, is to spend five hours a week as a lab monitor. All regularly open lab hours are covered by a lab monitor, who maintains the security, decorum, and neatness of the lab and reports any problems with hardware, software, or supplies to the lab manager. The lab manager takes care of routine problems and reports to the lab directors when a problem needs their advice or intervention. Traditionally, the PhD student serving as lab manager is compensated with an assignment to manage only the lab during their last semester, when they are finishing their dissertation and interviewing for jobs. In the other semesters of the year or years in which they serve as manager, they also have a regular teaching or research assignment and manage the lab on top of it.
History and Funding:
The original equipment was purchased through an NSF CISE Instrumentation Grant, "Collaborative Hypermedia" (9818309), $90,000, 1999 to Jan. 31, 2003 (Michael Bieber, PI; S.R. Hiltz and M. Turoff, co-PIs).
The special furniture, such as the large U-shaped cherry table, which cost approximately $18,000 because it had to be custom made, was purchased with grant money from the UPS Foundation, from a proposal written by Distinguished Professors Emeriti Murray Turoff and Starr Roxanne Hiltz.
Additional hardware and software were purchased through subsequent grants, including:
1. CISE, August 2002, $190,000 (Starr Roxanne Hiltz, PI); and
2. New Jersey Center for Pervasive Information Technology, from the New Jersey Commission on Science and Technology (joint with Princeton; S.R. Hiltz, co-PI; 9/2000-12/2005); and
3. Sloan Foundation: WebCenter for Learning Networks Effectiveness Research (Starr Roxanne Hiltz, PI; $370,000, Jan. 2001-Dec. 2004), with an extension grant through December 2005.
Currently, the university handles hardware for data intensive research.
Over 20 faculty members from NJIT and several visiting scholars from other universities have been involved in these projects as co-investigators or co-authors. In addition, hundreds of students were involved, either as research assistants, or as subjects in experiments who were subsequently debriefed and had an opportunity for hands-on learning about the experimental method.
Current Research Projects Conducted in Co-Lab:
Dr. Brook Wu’s projects:
- Information filtering by multiple examples: This approach uses multiple representative articles provided by a user as positive samples to represent a complex information need, without the user composing any search query. Using a semi-supervised Positive and Unlabeled Learning (PU Learning) approach, the system learns from the user’s samples and ranks all the documents in a document base (such as a digital library) by their relevance to the information need that the sample documents represent. To achieve a high level of learning performance even with few positive samples, the system utilizes under-sampling, which is especially beneficial when desired documents similar to the samples are not evenly distributed in the document base.
- Task-based user profiling for personalized query refinement: This project uses the user’s prior search sessions to model his or her evolving search interests with long- and short-term, and positive and negative, descriptors. To reduce the noise in the dataset, the clicked pages in the user’s search sessions are represented using social tags to form a pseudo user representation, from which the descriptors in the user’s profile are derived.
- Intent-based user segmentation with query enhancement for online advertisement: This project proposes a query enhancement mechanism that augments a user’s queries by leveraging the user’s query log, which provides more useful context for the user’s interests and hence reduces the ambiguity in the user’s inferred intent.
- Automatically generating audience level metadata for digital library resources: This project trains a support vector machine classifier to label digital library resources by subject and reading level automatically.
- Concept chaining utilizing meronyms in text characterization: This project utilizes semantic and linguistic content categorization, which will facilitate improved access methods for digital library resources.
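To make the first project above more concrete, here is a minimal sketch of ranking by multiple examples with under-sampling. It is not the lab's system: it swaps in a simple bagging-style PU heuristic, in which random under-samples of the unlabeled documents are treated as provisional negatives and a centroid-based scorer is averaged over rounds. The function names, the bag-of-words representation, and the toy documents are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average a list of sparse vectors (empty list gives an empty centroid)."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_by_examples(positives, unlabeled, rounds=20, seed=0):
    """Rank unlabeled documents by relevance to the positive examples.

    Each round under-samples the unlabeled pool down to the size of the
    positive set, treats that sample as provisional negatives, and scores
    only the documents left out of the sample. Scores are averaged.
    """
    if not unlabeled:
        return []
    rng = random.Random(seed)
    pos_centroid = centroid([tf_vector(d) for d in positives])
    unl_vecs = [tf_vector(d) for d in unlabeled]
    sample_size = min(len(positives), len(unl_vecs) - 1)

    totals = [0.0] * len(unl_vecs)
    counts = [0] * len(unl_vecs)
    for _ in range(rounds):
        sampled = set(rng.sample(range(len(unl_vecs)), sample_size))
        neg_centroid = centroid([unl_vecs[i] for i in sampled])
        for i, vec in enumerate(unl_vecs):
            if i in sampled:
                continue  # score only out-of-sample documents
            totals[i] += cosine(vec, pos_centroid) - cosine(vec, neg_centroid)
            counts[i] += 1

    scores = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    order = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
    return [(unlabeled[i], scores[i]) for i in order]


# Hypothetical toy data: two user-supplied examples and a small document base.
positives = ["digital library metadata search", "library search ranking documents"]
document_base = [
    "digital library search engine ranking",
    "soccer match results tonight",
    "cooking pasta recipe tomato",
    "metadata for library documents",
    "weather forecast rain tomorrow",
    "stock market prices fall",
]
for doc, score in rank_by_examples(positives, document_base):
    print(f"{score:+.3f}  {doc}")
```

Under-sampling keeps each provisional negative set the same size as the small positive set, so the many unlabeled documents do not swamp the few examples. A production system would likely use a stronger base classifier and a weighted term representation.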
Dr. Lian Duan’s projects:
- Dr. Duan is currently working on a new framework for evaluating correlation. Because of randomness, existing methods either evaluate the degree of correlation without controlling the type I error rate, or evaluate the type I error rate against independence rather than against a certain degree of correlation. His three main directions are: (1) developing a framework to properly measure the degree of correlation with a controlled type I error rate; (2) studying the bias of different measures under his framework and providing guidelines for choosing the proper measure in different situations; and (3) providing appropriate data structures to speed up the correlation search. Such research has significant impact on the social network and healthcare domains. It enables more precise measurement of the impact of one node on others in a social network, or of one drug on adverse drug reactions. In addition, it improves a very popular community detection method.
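The distinction between testing against independence and testing against a given degree of correlation can be illustrated with a textbook tool. The sketch below is not Dr. Duan's framework, which this page does not specify. It uses the standard Fisher z-transformation test for Pearson correlation, which assumes roughly bivariate normal data, and the function names and numbers are chosen purely for illustration.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def p_value_exceeds(r, n, rho0):
    """One-sided p-value for H0: rho <= rho0 against H1: rho > rho0.

    Uses the Fisher z-transformation: atanh(r) is approximately normal
    with mean atanh(rho) and variance 1 / (n - 3). Requires |r| < 1,
    |rho0| < 1, and n > 3.
    """
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    return 1.0 - NormalDist().cdf(z)


# Hypothetical observation: r = 0.5 measured on n = 103 pairs.
r, n = 0.5, 103
print("vs. independence (rho0 = 0.0):", p_value_exceeds(r, n, 0.0))
print("vs. rho0 = 0.4:               ", p_value_exceeds(r, n, 0.4))
```

Rejecting only when the p-value falls below a chosen level controls the type I error rate for the stated null hypothesis. The same observed correlation can be clearly significant against independence and yet fail to show that the true correlation exceeds 0.4, which is the gap between the two kinds of evaluation described above.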
Dr. Songhua Xu’s projects:
- Dr. Xu is working on several projects concerning big data processing, especially in the healthcare domain. He is also researching text extraction from images and a much more effective approach to patent search. Among his big data projects is one that studies the wealth of personal information that cancer patients and cancer-free people share openly online. He is developing domain-specific informatics tools to automatically reconstruct people's spatio-temporal lifelines, link them to spatio-temporal environmental data available from online sources such as the Environmental Protection Agency, and mine them using machine learning methods to search for salient associations between changes in migration-influenced environmental exposure and cancer risk.