CVIP2016 – International Conference on Computer Vision & Image Processing

Prof. Bidyut Baran Chaudhuri
Indian Statistical Institute – Kolkata

Title: A Kalman filtering based data clustering approach and evaluation of clustering methods

Abstract:

Research on Clustering has gained momentum with recent interest in Data mining and Knowledge discovery problems. To obtain good clustering, heuristic optimization based methods have gained popularity. A Kalman filtering based heuristic approach called Heuristic Kalman Algorithm (HKA) was proposed a few years ago. Here we show that HKA can be efficiently employed in partitional data clustering. Next, we propose an improved approach named HKA-K that combines HKA with a step of the K-means algorithm. It is shown that HKA-K is at least as good as, and sometimes better than, some other state-of-the-art clustering algorithms. In the second part of the talk, we consider quality evaluation of any Clustering Algorithm (CA) on benchmark data. As the quality measure we use the accuracy of the output clusters. Since a CA cannot label the clusters by class names, for N clusters, N! different sets of name labels can be generated. Using the data ground truth, the accuracy for each set of labels can be calculated, and the highest value over all sets may be accepted as the clustering accuracy. Since O(N!) computation is prohibitively expensive, here we explore how this can be improved.
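
The exhaustive evaluation described in the abstract can be illustrated with a minimal Python sketch. This is only the brute-force O(N!) baseline, not the improved procedure discussed in the talk; the function name clustering_accuracy and the toy data are hypothetical and chosen for illustration.

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids):
    """Best accuracy over every assignment of class names to clusters.

    Brute force: tries each of the N! name assignments, so it is only
    practical for a small number of clusters.
    """
    if not true_labels or len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must be non-empty and equally long")
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    if len(clusters) > len(classes):
        raise ValueError("sketch assumes no more clusters than classes")

    best = 0.0
    for names in permutations(classes, len(clusters)):
        # One candidate naming: cluster id -> class name.
        mapping = dict(zip(clusters, names))
        hits = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best = max(best, hits / len(true_labels))
    return best


# Toy example (assumed data): three classes, one point placed in the wrong cluster.
truth = ["a", "a", "b", "b", "c", "c"]
found = [2, 2, 0, 0, 1, 0]
accuracy = clustering_accuracy(truth, found)
print(f"clustering accuracy: {accuracy:.3f}")
```

With N clusters the loop runs N! times, which is why this form is usable only for small N.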

Short Biography:

Prof B B Chaudhuri is the INAE Distinguished Professor and J C Bose Fellow at the Computer Vision and Pattern Recognition Unit of the Indian Statistical Institute. His research interests include Pattern Recognition, Image Processing, Computer Vision, Natural Language Processing (NLP), Signal processing, Cognitive science, Data compression and Digital Document Processing. He pioneered the first Indian language Bharati Braille system for the blind, a successful Bangla speech synthesis system, and the first workable OCR for Bangla, Devnagari, Assamese and Oriya scripts. In the NLP area, he pioneered a robust Indian language spell-checker, morphological processor, multi-word expression detector and statistical analyzer. Some of his technologies have been transferred to industry for commercialization. He has published about 400 research papers in reputed international journals, conference proceedings and edited books. Also, he has authored/edited seven books entitled Two Tone Image Processing and Recognition (Wiley Eastern, 1993), Object Oriented Programming: Fundamentals and Applications (Prentice Hall, 1998), Computer and Software Technology Dictionary (Ananda Publishers, 2002), Digital Document Processing (Springer-Verlag, 2007), Bangla Sound symbolism: Characteristics and Dictionary (Bangla Academy, 2011), Multimedia Information extraction and digital heritage preservation (World Scientific, 2011), Some Articles on Language Technology (Ananda Publishers, 2012), Advances in Digital Document Processing and Retrieval (World Scientific, 2014). He received the Leverhulme fellowship award in 1981-82, the Sir J. C. Bose Memorial Award in 1986, the M. N. Saha Memorial Award (twice) in 1989 and 1991, the Homi Bhabha Fellowship award in 1992, the Dr. Vikram Sarabhai Research Award in 1995, the C. Achuta Menon Prize in 1996, the Homi Bhabha Award: Applied Sciences in 2003, the Ram Lal Wadhwa Gold Medal in 2005, the Jawaharlal Nehru Fellowship during 2004-2006, the J C Bose fellowship in 2010, and the Om Prakash Vasin Award in 2011. He is an associate editor of six international journals. He is also a fellow of IETE, WBAST, INSA, NASc, IAPR, IEEE and The World Academy of Sciences (TWAS).

—

Prof. Christian Micheloni
University of Udine, Italy

Title: Fill the gap for non-continuous long term tracking

Abstract:

The recent advancement and the price plummet of imaging sensor technologies have remarkably increased the adoption of video analytics systems for various applications ranging from home to border surveillance. However, due to the amount of human supervision, privacy concerns, and maintenance costs involved, it is still not possible to deploy camera networks covering all the areas of a wide environment. Thus, only a small portion of it is monitored by cameras, which often have non-overlapping fields-of-view (FoVs). As a result, there exist blind areas from which no information can be directly obtained. This raises the need for methods able to link the information acquired between the covered areas such that high-level semantics can be obtained. Countless applications like multi-camera tracking, situational awareness, and multi-camera event detection would benefit from such methods. One of the most attractive current issues that such blind areas have introduced is the problem of re-associating the same person who is moving in a wide environment and who might be detected at a different location and time. This is known as the person re-identification problem. The talk will cover the major issues in the person re-identification problem, the main strategies proposed in the state of the art, the most promising baseline approaches, and a view of the advances expected in the near future.

Short Biography:

Christian Micheloni is an associate professor at the University of Udine. Since 2000 he has taken part in European research, working under contract on several European Projects. He has co-authored several scientific works published in International Journals and Refereed International Conferences. He has served as a reviewer for several International Journals and has been on the organizing and program committees of different international conferences. His main interests involve active vision for wide area scene understanding, neural networks for classification and recognition, and resource-aware camera networks to establish proper control protocols for improving cognition capabilities. He is also interested in pattern recognition and machine learning. Current research projects include camera network self-reconfiguration and person re-identification. He is a member of the International Association of Pattern Recognition (IAPR) and a member of the IEEE.

—

Prof. Prabir Kumar Biswas
Indian Institute of Technology – Kharagpur

Title: Creation and Processing of 3-D Images

Abstract:

Three-dimensional images in the form of 3D point clouds are of utmost importance in applications involving three-dimensional modelling of objects. The problem has gained a new dimension of importance with growing awareness of the need to preserve archaeological monuments. 3D modelling not only helps in making digital copies of monuments and creating virtual museums, it also helps in carrying out accurate restoration whenever required.
In this talk I shall present a low cost 3D scanner that is capable of capturing minute surface details using a laser projection technique. The laser projector casts a plane of light onto the target object, which intersects with the object to create an illuminated curve on it. The laser-illuminated object is captured by the CCD camera. Using ray-plane intersection, we can compute the 3D coordinates of all points in the camera image illuminated by the laser.
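
As a minimal Python sketch of the ray-plane intersection step (not the scanner's actual implementation): the camera ray through an illuminated pixel is written as origin + t * direction, and the laser plane is given by a point on it and its normal. The function name, the parameter names and the example geometry are assumptions made for illustration; in a real system they would come from camera and laser calibration.

```python
def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the 3D point where a ray meets a plane, or None.

    The ray is origin + t * direction with t >= 0; the plane passes
    through plane_point and has normal plane_normal.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray is (nearly) parallel to the plane
    offset = [p - o for p, o in zip(plane_point, origin)]
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # plane lies behind the ray origin
    return tuple(o + t * d for o, d in zip(origin, direction))


# Assumed geometry: camera centre at the origin, laser plane x = 1.
camera_centre = (0.0, 0.0, 0.0)
pixel_ray = (0.5, 0.0, 1.0)  # direction of the ray through one lit pixel
point = ray_plane_intersection(camera_centre, pixel_ray,
                               plane_point=(1.0, 0.0, 0.0),
                               plane_normal=(1.0, 0.0, 0.0))
print(point)
```

Repeating this for every laser-lit pixel in a frame yields one 3D profile curve of the object.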
Due to self-occlusion and concavities in the scanned object, only a part of the object can be scanned from one point of view. Therefore, we need to perform multiple scans and combine scan data from multiple wide-baseline viewpoints into a common reference coordinate system. Moreover, because the environment is unstructured, the pose and the position of the scans are not known beforehand and need to be estimated from the information within the scan data itself. For N views, we get a set of point clouds P and a set of texture bitmaps. Key point extraction detects and describes a set of points in each of these views that are invariant to rigid transformation. To ensure this, key points are usually chosen at corners, blob-like structures, T-junctions, etc. In our implementation we have used Harris 3D corners to detect the key points in the point clouds, as this has been found to yield robust results. A key point needs to be described by a feature vector computed from its neighbourhood. A pair of views is registered by identifying common parts of the object observed in the corresponding point clouds and matching at least 4 points to compute the rigid 3D transformation matrix. The 3D transformation matrix is used to transform the point clouds to a common frame of reference.
The transformation matrix obtained in the above step is based on pairwise matching and is likely to suffer from error accumulation. This can ultimately lead to misaligned views when a number of such scans are merged. Therefore, a global error minimization algorithm has to be used. The set of transformed point clouds is further refined by global registration to obtain a new set of transformations and transformed points.
The registered point cloud is just a collection of discrete sample points on the continuous surface of the object. For proper visualization of the 3D object we generate the surface as a triangular mesh, which is a piecewise linear approximation of the continuous surface. Poisson surface reconstruction is an implicit volumetric technique for surface reconstruction from a collection of 3D points. It fits a watertight surface to a given point cloud. The algorithm is highly resilient to noise, provided the surface normals are estimated accurately.
The results show that our scanning system is able to acquire minute details on the object surface and generate dense 3D data. The scanner operation is non-invasive, i.e., an object can be scanned without touching it or removing it from its present location. There is no need to paste stickers or any kind of markers on the object to be scanned. Placing the object on a turntable or similar apparatus is also not required. This is especially important when safeguarding archaeological artefacts, as they are often delicate and must not be disturbed if they are to be preserved in pristine condition.

Short Biography:

Prabir Kumar Biswas received the B.Tech. (Honors) degree in Electronics and Electrical Communication Engineering, the M.Tech. degree in Automation and Control Engineering, and the Ph.D. degree in Computer Vision from the Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology (IIT), Kharagpur, India, in 1985, 1989, and 1991, respectively. From 1985 to 1987, he was with Bharat Electronics Ltd., Ghaziabad, India, as a Deputy Engineer. Since 1991, he has been working as a Faculty Member in the Department of Electronics and Electrical Communication Engineering, IIT Kharagpur, where he is currently a Professor. He visited the University of Kaiserslautern, Germany, under an Alexander von Humboldt Research Fellowship from March 2002 to February 2003. He has more than 100 research publications in international and national journals and conferences and has filed seven international patents. His areas of interest are image processing, pattern recognition, computer vision, video compression, parallel and distributed processing, and computer networks.

—

Dr. Pradeep Atrey
State University of New York at Albany, U.S.A

Title: Security and Privacy Issues in Multimedia Systems

Abstract:

In this talk, I will first highlight the security and privacy issues in multimedia systems used in various applications such as homeland security surveillance, social media, and medical imaging and then discuss some of my recent research contributions related to secure cloud-based multimedia analytics and editing, privacy-aware surveillance, and social networking. Finally, I will present the open research challenges in this area.

Short Biography:

Pradeep K. Atrey is an Associate Professor at the State University of New York, Albany, NY, USA. He is also an Adjunct Professor at the University of Ottawa, Canada. Previously he was an Associate Professor at the University of Winnipeg, Canada. He received his Ph.D. in Computer Science from the National University of Singapore, and his M.S. in Software Systems and B.Tech. in Computer Science and Engineering from India. He was a Postdoctoral Researcher at the Multimedia Communications Research Laboratory, University of Ottawa, Canada. His current research interests are in the area of Security and Privacy with a focus on multimedia surveillance and privacy, multimedia security, secure-domain cloud-based large-scale multimedia analytics, and social media. He has authored/co-authored over 95 research articles in reputed ACM, IEEE, and Springer journals and conferences. His research has been funded by Canadian Govt. agencies NSERC and DFAIT, and by the Govt. of Saudi Arabia. Dr. Atrey is on the editorial board of several journals including ACM Trans. on Multimedia Computing, Communications and Applications (TOMM), ETRI Journal and IEEE Communications Society Review Letters. He was also a guest editor for the Springer Multimedia Systems and Multimedia Tools and Applications journals. He has been associated with over 30 international conferences/workshops in various roles such as General Chair, Program Chair, Publicity Chair, Web Chair, Demo Chair and TPC Member. Dr. Atrey was a recipient of the ACM TOMM Associate Editor of the Year (2015), the IEEE Comm. Soc. MMTC Best R-Letter Editor Award (2015), the Erica and Arnold Rogers Award for Excellence in Research and Scholarship (2014), the ETRI Journal Best Editor Award (2012), the ETRI Journal Best Reviewer Award (2009) and three University of Winnipeg Merit Awards for Exceptional Performance (2010, 2012 and 2013). He was also recognized as the ACM Multimedia Rising Star (2015), the ICME Outstanding Organizing Committee Member (as Publicity Chair) (2013) and the ICME Quality Reviewer (2011).

—

Prof. Santanu Chaudhury
Indian Institute of Technology – Delhi

Title: 3D Content Generation using Uncalibrated Views

Abstract:

3D content is being used today in a variety of applications – 3DTV, Games, Visualisation, etc. In this talk we shall examine approaches developed for 3D content generation from one or more uncalibrated 2D views without going through traditional explicit 3D reconstruction. We shall see how computational algebra can be used to generate arbitrary 3D views from a few images of the scene taken from general positions. The use of machine learning for generating 3D views from 2D images will also be examined. We shall illustrate the applicability of these techniques using some real-world videos.

Short Biography:

Santanu Chaudhury received the B.Tech. and Ph.D. degrees from the Indian Institute of Technology Kharagpur, in 1984 and 1989, respectively. He is currently the Schlumberger Chair Professor in the Department of Electrical Engineering, IIT Delhi. His research interests are in the areas of multimedia information retrieval, document image analysis and artificial intelligence. Dr. Chaudhury was awarded the INSA medal for young scientists in 1993. He is a fellow of the Indian National Academy of Engineers and the National Academy of Sciences, India.

—

Contacts:
CVIP2016 [at] iitr.ac.in