Salman Khan

I'm a computer vision researcher affiliated with MBZUAI and ANU.

Twitter   G. Scholar LinkedIn E-Mail

Code & Data

Datasets and Protocols

  • AnimalWeb - A Large-Scale Hierarchical Dataset of Annotated Animal Faces: We introduce a large-scale, hierarchical annotated dataset of animal faces, featuring 21.9K faces captured under ‘in-the-wild’ conditions. These faces belong to 334 diverse species, covering 21 different animal orders across biological taxonomy. Each face is consistently annotated with 9 landmarks on key facial features. The dataset is structured and scalable by design; its development went through four systematic stages involving rigorous manual annotation effort of over 6K man-hours. We benchmark the proposed dataset for face alignment using existing methods under two new problem settings. The results showcase its challenging nature and unique attributes, and present clear prospects for novel, adaptive, and generalized face-oriented CV algorithms. We further benchmark the dataset on related tasks, namely face detection and fine-grained recognition, to demonstrate multi-task applications and opportunities for improvement. For more details, please see our paper and dataset page.
  • iSAID - A Large-scale Dataset for Instance Segmentation in Aerial Images: Existing Earth Vision datasets are suitable for either semantic segmentation or object detection. iSAID is the first benchmark dataset for instance segmentation in aerial images. This large-scale and densely annotated dataset contains 655,451 object instances for 15 categories across 2,806 high-resolution images. The distinctive characteristics of iSAID are the following: (a) a large number of images with high spatial resolution, (b) fifteen important and commonly occurring categories, (c) a large number of instances per category, (d) a large count of labelled instances per image, which might help in learning contextual information, (e) huge object scale variation, with small, medium and large objects often appearing within the same image, (f) an imbalanced and uneven distribution of objects with varying orientation within images, depicting real-life aerial conditions, (g) several small objects with ambiguous appearance that can only be resolved with contextual reasoning, (h) precise instance-level annotations carried out by professional annotators, then cross-checked and validated by expert annotators following well-defined guidelines. For more details, please refer to our paper and the dataset page.
  • ImageNet Zero-Shot Object Detection Protocol: The train/val/test splits for zero-shot object detection based on the ILSVRC object detection dataset are available here. The instructions on how to use the proposed splits are available here. The motivation and details for the proposed train and test protocol can be found in the associated publication and project page.
  • MS-COCO Zero-Shot Object Detection Protocol: The train/val/test splits for zero-shot object detection based on the MS-COCO object detection dataset are available here. The instructions on how to use the proposed splits are available here. The motivation and details for the proposed train and test protocol can be found in the associated publication and project page.
  • Object Categories in Indoor Scenes: This database contains a total of 15,324 images spanning more than 1,300 frequently occurring indoor object categories. The database can potentially be used for fine-grained scene categorization, high-level scene understanding and attribute-based reasoning. The dataset is available for download here. More details about the dataset can be found in the associated publication.
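
For a rough sense of annotation density, the headline figures quoted above can be turned into simple averages. This is only a minimal sketch using the numbers stated in the descriptions; the constant and function names are illustrative, and the AnimalWeb face count is the rounded 21.9K figure, so those averages are approximate.

```python
# Headline figures as quoted in the dataset descriptions above.
ISAID_INSTANCES = 655_451
ISAID_IMAGES = 2_806
ISAID_CATEGORIES = 15

ANIMALWEB_FACES = 21_900   # assumption: "21.9K" taken as 21,900 (rounded)
ANIMALWEB_SPECIES = 334
ANIMALWEB_ORDERS = 21


def average(total: float, groups: int) -> float:
    """Return the mean count per group, rejecting non-positive group counts."""
    if groups <= 0:
        raise ValueError("groups must be a positive integer")
    return total / groups


# Illustrative summary statistics derived only from the quoted totals.
stats = {
    "iSAID instances per image": average(ISAID_INSTANCES, ISAID_IMAGES),
    "iSAID instances per category": average(ISAID_INSTANCES, ISAID_CATEGORIES),
    "AnimalWeb faces per species (approx.)": average(ANIMALWEB_FACES, ANIMALWEB_SPECIES),
    "AnimalWeb species per order": average(ANIMALWEB_SPECIES, ANIMALWEB_ORDERS),
}

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:.1f}")
```

These are plain means over the whole dataset; they say nothing about how instances are actually distributed, which the iSAID description notes is imbalanced and uneven.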

Codes