Plenary Talks & Speakers



Prof. Bidyut Baran Chaudhuri
Indian Statistical Institute – Kolkata

Title: A Kalman filtering based data clustering approach and evaluation of clustering methods


Research on clustering has gained momentum with recent interest in data mining and knowledge discovery problems. To obtain good clusterings, heuristic optimization based methods have gained popularity. A Kalman filtering based heuristic approach called the Heuristic Kalman Algorithm (HKA) was proposed a few years ago. Here we show that HKA can be efficiently employed in partitional data clustering. Next, we propose an improved approach named HKA-K that combines HKA with a step of the K-means algorithm. It is shown that HKA-K is at least as good as, and sometimes better than, some other state-of-the-art clustering algorithms. In the second part of the talk, we consider quality evaluation of any Clustering Algorithm (CA) with benchmark data. As the quality measure we use the accuracy of the output clusters. Since a CA cannot label the clusters by class names, for N clusters, N! different sets of name labels can be generated. Using the data ground truth, the accuracy for each set of labels can be calculated, and the highest value over all sets may be accepted as the clustering accuracy. Since O(N!) computation is prohibitively expensive, here we explore how this can be improved.
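The exhaustive evaluation described in the abstract can be sketched as follows. This is only a minimal illustration of the O(N!) baseline, not the improved method presented in the talk; the function name `clustering_accuracy` and the toy data are assumptions made for this example.

```python
from itertools import permutations


def clustering_accuracy(cluster_ids, true_labels, class_names):
    """Return the best accuracy over every assignment of class names to clusters.

    Hypothetical helper: tries all permutations, so the cost grows as N!.
    Returns 0.0 if there are fewer class names than clusters.
    """
    clusters = sorted(set(cluster_ids))
    best = 0.0
    for names in permutations(class_names, len(clusters)):
        mapping = dict(zip(clusters, names))  # one candidate set of name labels
        hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best = max(best, hits / len(true_labels))
    return best


# Toy data (assumed): three clusters, three ground-truth classes.
cluster_ids = [0, 0, 1, 1, 2, 2, 2]
true_labels = ["b", "b", "c", "a", "a", "a", "a"]
acc = clustering_accuracy(cluster_ids, true_labels, ["a", "b", "c"])
print(acc)
```

With three clusters only 3! = 6 labelings are tried, but ten clusters already require 3,628,800, which is why a cheaper evaluation is of interest.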

Short Biography:

Prof. B. B. Chaudhuri is the INAE Distinguished Professor and J. C. Bose Fellow at the Computer Vision and Pattern Recognition Unit of the Indian Statistical Institute. His research interests include Pattern Recognition, Image Processing, Computer Vision, Natural Language Processing (NLP), Signal Processing, Cognitive Science, Data Compression, Digital Document Processing, etc. He pioneered the first Indian language Bharati Braille system for the blind, a successful Bangla speech synthesis system, as well as the first workable OCR for Bangla, Devnagari, Assamese and Oriya scripts. In the NLP area, he pioneered a robust Indian language spell-checker, morphological processor, multi-word expression detector and statistical analyzer. Some of his technologies have been transferred to industry for commercialization. He has published about 400 research papers in reputed international journals, conference proceedings and edited books. Also, he has authored/edited seven books entitled Two Tone Image Processing and Recognition (Wiley Eastern, 1993), Object Oriented Programming: Fundamentals and Applications (Prentice Hall, 1998), Computer and Software Technology Dictionary (Ananda Publishers, 2002), Digital Document Processing (Springer-Verlag, 2007), Bangla Sound Symbolism: Characteristics and Dictionary (Bangla Academy, 2011), Multimedia Information Extraction and Digital Heritage Preservation (World Scientific, 2011), Some Articles on Language Technology (Ananda Publishers, 2012), Advances in Digital Document Processing and Retrieval (World Scientific, 2014). He received the Leverhulme Fellowship Award in 1981-82, the Sir J. C. Bose Memorial Award in 1986, the M. N. Saha Memorial Award (twice) in 1989 and 1991, the Homi Bhabha Fellowship Award in 1992, the Dr. Vikram Sarabhai Research Award in 1995, the C. Achuta Menon Prize in 1996, the Homi Bhabha Award: Applied Sciences in 2003, the Ram Lal Wadhwa Gold Medal in 2005, the Jawaharlal Nehru Fellowship during 2004-2006, the J. C. Bose Fellowship in 2010, and the Om Prakash Bhasin Award in 2011. He is an associate editor of six international journals. He is also a fellow of IETE, WBAST, INSA, NASc, IAPR, IEEE and The World Academy of Sciences (TWAS).


Prof. Christian Micheloni
University of Udine, Italy

Title: Fill the gap for non-continuous long term tracking


The recent advancement and the price plummet of imaging sensor technologies have remarkably increased the adoption of video analytics systems for various applications ranging from home to border surveillance. However, due to the amount of human supervision, privacy concerns, and maintenance costs involved, it is still not possible to deploy camera networks covering all the areas of a wide environment. Thus, only a small portion of it is monitored by cameras, which often have non-overlapping fields-of-view (FoVs). As a result, there exist blind areas from which no information can be directly obtained. This raises the need for methods able to link the information acquired between the covered areas such that high-level semantics can be obtained. Countless applications like multi-camera tracking, situational awareness, and multi-camera event detection would benefit from such methods. One of the most attractive current issues that such blind areas have introduced is the problem of re-associating the same person who is moving in a wide environment and who might be detected at a different location and time. This is known as the person re-identification problem. The talk will cover the major issues in the person re-identification problem, the main strategies proposed in the state of the art, the most promising baseline approaches, and a vision of the advances expected in the near future.

Short Biography:

Christian Micheloni is an associate professor at the University of Udine. Since 2000 he has taken part in European research under contract for several European projects. He has co-authored various scientific works published in international journals and refereed international conferences. He has served as a reviewer for several international journals and has been on the organizing and program committees of various international conferences. His main interests involve active vision for wide area scene understanding, neural networks for classification and recognition, and resource-aware camera networks to establish proper control protocols for improving cognition capabilities. He is also interested in pattern recognition and machine learning. Current research projects include camera network self-reconfiguration and person re-identification. He is a member of the International Association for Pattern Recognition (IAPR) and a member of the IEEE.


Prof. Prabir Kumar Biswas
Indian Institute of Technology – Kharagpur

Title: Creation and Processing of 3-D Images


Three-dimensional images in the form of 3D point clouds are of utmost importance in various applications involving three-dimensional modelling of objects. The problem has gained a new dimension of importance with the growing awareness of the need to preserve archaeological monuments. 3D modelling not only helps in making digital copies of monuments and creating virtual museums, it also helps in carrying out accurate restoration whenever required.
In this talk I shall present a low-cost 3D scanner that is capable of capturing minute surface details using a laser projection technique. The laser projector casts a plane of light onto the target object, which intersects with the object to create an illuminated curve on it. The laser-illuminated object is captured by the CCD camera. Using ray-plane intersection, we can compute the 3D coordinates of all points in the camera image illuminated by the laser.
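The ray-plane intersection step can be illustrated with a short sketch. It assumes the camera ray through a pixel and the laser plane are already known in a common coordinate frame; the function name, parameter names and numeric values are hypothetical and do not come from the scanner described in the talk.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-12):
    """Intersect the ray origin + t * direction (t >= 0) with a plane.

    The plane is given by a point on it and its normal.
    Returns the 3D point, or None if the ray is parallel to the plane
    or the intersection lies behind the ray origin.
    """
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray runs parallel to the laser plane
    offset = tuple(p - o for p, o in zip(plane_point, origin))
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # plane is behind the camera
    return tuple(o + t * d for o, d in zip(origin, direction))


# Assumed setup: camera centre at the origin, one pixel ray, one laser plane.
camera_centre = (0.0, 0.0, 0.0)
pixel_ray = (0.1, 0.2, 1.0)
laser_point = (0.0, 0.0, 2.0)
laser_normal = (1.0, 0.0, 1.0)
point_3d = ray_plane_intersection(camera_centre, pixel_ray, laser_point, laser_normal)
print(point_3d)
```

Repeating this for every laser-lit pixel in a frame yields one 3D curve; sweeping the laser plane over the object builds up the point cloud.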
Due to self-occlusion and concavities in the scanned object, it is possible to scan only a part of the object from one point of view. Therefore, we need to perform multiple scans and combine scan data from multiple wide-baseline viewpoints into a common reference coordinate system. Moreover, because the environment is unstructured, the pose and position of the scans are not known beforehand and need to be derived from the information within the scan data itself. For N views, we get a set of point clouds P and a set of texture bitmaps. Key point extraction detects and describes a set of points from each of these views that are invariant to rigid transformation. To ensure this, key points are usually chosen at corners, blob-like structures, T-junctions, etc. In our implementation we have used Harris 3D corners to detect the key points in the point clouds, as this has been found to yield robust results. A key point needs to be described by a feature vector computed from its neighbourhood. A pair of views is registered by identifying common parts of the object observed in the corresponding point clouds and matching at least 4 points to compute the rigid 3D transformation matrix. The 3D transformation matrix is used to transform the point clouds to a common frame of reference.
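The final step, moving a point cloud into the common frame with a rigid 3D transformation matrix, can be sketched as below. The sketch assumes the matrix is already known (estimating it from matched key points is not shown), and the helper names and example values are illustrative only.

```python
import math


def rigid_transform(rotation, translation):
    """Build a 4x4 homogeneous matrix from a 3x3 rotation and a translation."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def apply_transform(matrix, points):
    """Apply a 4x4 rigid transformation to a list of (x, y, z) points."""
    moved = []
    for x, y, z in points:
        moved.append(tuple(
            matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
            for r in range(3)
        ))
    return moved


# Assumed example: rotate 90 degrees about the z axis, then shift by 1 along x.
theta = math.pi / 2
rot_z = [
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta), math.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
]
T = rigid_transform(rot_z, (1.0, 0.0, 0.0))
cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
registered = apply_transform(T, cloud)
print(registered)
```

Because the transformation is rigid, distances between points in the cloud are preserved; only the pose of the whole scan changes.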
The transformation matrix obtained in the above step is based on pairwise matching and is likely to suffer from error accumulation. This can ultimately lead to misaligned views when a number of such scans are merged. Therefore, a global error minimization algorithm has to be used. The set of transformed point clouds is further refined by global registration to get a new set of transformations and transformed point clouds.
The registered point cloud is just a collection of discrete sample points on the continuous surface of the object. For proper visualization of the 3D object we generate the surface as a triangular mesh, which is a piecewise linear approximation of the continuous surface. Poisson surface reconstruction is an implicit volumetric technique for surface reconstruction from a collection of 3D points. It fits a watertight surface to a given point cloud. The algorithm is highly resilient to noise, subject to accurate estimation of surface normals.
The results show that our scanning system is able to acquire minute details on the object surface and generate dense 3D data. The scanner operation is non-invasive, i.e. an object can be scanned without touching it or removing it from its present location. There is no need to paste stickers or any kind of markers on the object to be scanned. Placing the object on a turntable or similar apparatus is also not required. This is especially important for safeguarding archaeological artefacts, as they are often delicate and must not be disturbed if they are to be preserved in pristine condition.

Short Biography:

Prabir Kumar Biswas received the B.Tech. (Honors) degree in Electronics and Electrical Communication Engineering, the M.Tech. degree in Automation and Control Engineering, and the Ph.D. degree in Computer Vision from the Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology (IIT), Kharagpur, India, in 1985, 1989, and 1991, respectively. From 1985 to 1987, he was with Bharat Electronics Ltd., Ghaziabad, India, as a Deputy Engineer. Since 1991, he has been working as a Faculty Member in the Department of Electronics and Electrical Communication Engineering, IIT Kharagpur, where he is currently a Professor. He visited the University of Kaiserslautern, Germany, under an Alexander von Humboldt Research Fellowship from March 2002 to February 2003. He has more than 100 research publications in international and national journals and conferences and has filed seven international patents. His areas of interest are image processing, pattern recognition, computer vision, video compression, parallel and distributed processing, and computer networks.


Dr. Pradeep Atrey
State University of New York at Albany, U.S.A.

Title: Security and Privacy Issues in Multimedia Systems 


In this talk, I will first highlight the security and privacy issues in multimedia systems used in various applications such as homeland security surveillance, social media, and medical imaging and then discuss some of my recent research contributions related to secure cloud-based multimedia analytics and editing, privacy-aware surveillance, and social networking. Finally, I will present the open research challenges in this area.

Short Biography:

Pradeep K. Atrey is an Associate Professor at the State University of New York, Albany, NY, USA. He is also an Adjunct Professor at the University of Ottawa, Canada. Previously he was an Associate Professor at the University of Winnipeg, Canada. He received his Ph.D. in Computer Science from the National University of Singapore, and his M.S. in Software Systems and B.Tech. in Computer Science and Engineering from India. He was a Postdoctoral Researcher at the Multimedia Communications Research Laboratory, University of Ottawa, Canada. His current research interests are in the area of Security and Privacy with a focus on multimedia surveillance and privacy, multimedia security, secure-domain cloud-based large-scale multimedia analytics, and social media. He has authored/co-authored over 95 research articles in reputed ACM, IEEE, and Springer journals and conferences. His research has been funded by the Canadian Govt. agencies NSERC and DFAIT, and by the Govt. of Saudi Arabia. Dr. Atrey is on the editorial board of several journals including ACM Trans. on Multimedia Computing, Communications and Applications (TOMM), ETRI Journal and IEEE Communications Society Review Letters. He was also a guest editor for the Springer Multimedia Systems and Multimedia Tools and Applications journals. He has been associated with over 30 international conferences/workshops in various roles such as General Chair, Program Chair, Publicity Chair, Web Chair, Demo Chair and TPC Member. Dr. Atrey was a recipient of the ACM TOMM Associate Editor of the Year (2015), the IEEE Comm. Soc. MMTC Best R-Letter Editor Award (2015), the Erica and Arnold Rogers Award for Excellence in Research and Scholarship (2014), the ETRI Journal Best Editor Award (2012), the ETRI Journal Best Reviewer Award (2009) and three University of Winnipeg Merit Awards for Exceptional Performance (2010, 2012 and 2013). He was also recognized as the ACM Multimedia Rising Star (2015), the ICME Outstanding Organizing Committee Member (as Publicity Chair) (2013) and the ICME Quality Reviewer (2011).


Prof. Santanu Chaudhury
Indian Institute of Technology – Delhi

Title: 3D Content Generation using Uncalibrated Views


3D content is being used today in a variety of applications – 3DTV, games, visualisation, etc. In this talk we shall examine approaches developed for 3D content generation from one or more uncalibrated 2D views without going through traditional explicit 3D reconstruction. We shall see how computational algebra can be used to generate arbitrary 3D views from a few images of the scene taken from general positions. The use of machine learning for generating 3D views from 2D images will also be examined. We shall illustrate the applicability of these techniques using some real-world videos.

Short Biography:

Santanu Chaudhury received the B.Tech. and Ph.D. degrees from the Indian Institute of Technology Kharagpur, in 1984 and 1989, respectively. He is currently the Schlumberger Chair Professor in the Department of Electrical Engineering, IIT Delhi. His research interests are in the areas of multimedia information retrieval, document image analysis and artificial intelligence. Dr. Chaudhury was awarded the INSA medal for young scientists in 1993. He is a fellow of the Indian National Academy of Engineering and the National Academy of Sciences, India.

CVIP2016 [at]