Complex Object Management in Databases: About the Preparedness of Database Technology for New Emerging Applications
The quantity and nature of data have changed over the years. While the beginning of database technology was characterized by the handling of manageable volumes of alphanumerical data of simple structure, we are now confronted with “big data”. The phrase “big data” refers to large, diverse, complex, and/or distributed data sets. Large data sets are generated, for example, by instruments, sensors, and satellites and usually have a simple internal structure. The focus of this talk is on the complexity aspect of big data. New emerging (that is, non-traditional) applications, including biological, genomic, multimedia, digital library, imaging, scientific, location-based, geospatial, and spatiotemporal technologies, have necessitated the handling of complex application objects. These objects are highly structured, large in size, and of variable representation length. Currently, such objects are handled either with scientific file formats such as HDF and NetCDF or with special built-in database data types such as XML and BLOB.
However, some of these approaches are very application-specific and/or do not provide proper levels of data abstraction for users. Others do not support random updates, or cannot manage large volumes of structured data while also providing the associated high-level operations. In this talk, we consider the state of the art of complex object management in databases and analyze the requirements, solutions, and weaknesses of available approaches. Finally, we introduce our ongoing work on a novel two-step solution to managing and querying complex application objects within databases. The first step introduces a novel data type called Intelligent Binary Large Object (iBLOB) that leverages the traditional BLOB type in databases, preserves the structure of application objects, and provides smart query and update capabilities. The second step consists of a generalized conceptual framework to capture and validate the structure of application objects by means of a type structure specification (TSS). The iBLOB framework generates a type-structure-specific application programming interface that allows applications to easily access the components of complex application objects. This greatly simplifies the implementation of new type systems for complex application objects inside database systems.
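To make the two ideas more concrete, the following Python sketch stores a hierarchical object in a flat byte string while keeping its structure, and checks that structure against a small specification. It is an illustration under stated assumptions only and not the iBLOB or TSS design, whose details are not given in this abstract. The names `Node`, `SPEC`, `validate`, `locate`, `serialize`, and `deserialize`, the region/face/segment example, and the length-prefixed byte layout are all hypothetical.

```python
import struct
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical structure specification: each type names the type of its
# children, or None if it is a leaf that carries raw bytes.
SPEC: Dict[str, Optional[str]] = {"region": "face", "face": "segment", "segment": None}


@dataclass
class Node:
    type_name: str
    payload: bytes = b""                      # used by leaf nodes only
    children: List["Node"] = field(default_factory=list)


def validate(node: Node) -> bool:
    """Check that a node and all of its descendants conform to SPEC."""
    if node.type_name not in SPEC:
        return False
    child_type = SPEC[node.type_name]
    if child_type is None:                    # leaf: must not have children
        return not node.children
    if node.payload:                          # structured node: no raw payload
        return False
    return all(c.type_name == child_type and validate(c) for c in node.children)


def locate(root: Node, path: Sequence[int]) -> Node:
    """Follow a path of child indexes to reach one component."""
    node = root
    for index in path:
        node = node.children[index]
    return node


def serialize(node: Node) -> bytes:
    """Flatten a node into bytes: a 4-byte length or child count, then the body."""
    if SPEC[node.type_name] is None:
        return struct.pack(">I", len(node.payload)) + node.payload
    body = b"".join(serialize(c) for c in node.children)
    return struct.pack(">I", len(node.children)) + body


def deserialize(data: bytes, type_name: str, offset: int = 0) -> Tuple[Node, int]:
    """Rebuild a node of the given type; returns the node and the next offset."""
    (n,) = struct.unpack_from(">I", data, offset)
    offset += 4
    child_type = SPEC[type_name]
    if child_type is None:
        return Node(type_name, payload=data[offset:offset + n]), offset + n
    children = []
    for _ in range(n):
        child, offset = deserialize(data, child_type, offset)
        children.append(child)
    return Node(type_name, children=children), offset


# Usage: build a small object, update one component, and round-trip it.
region = Node("region", children=[
    Node("face", children=[Node("segment", b"\x00\x01"), Node("segment", b"\x02\x03")]),
    Node("face", children=[Node("segment", b"\x04\x05")]),
])
locate(region, [0, 1]).payload = b"\xff\xff\xff"   # update one leaf in place
blob = serialize(region)                            # bytes that could go into a BLOB column
restored, end = deserialize(blob, "region")
```

In this sketch the update is applied to the in-memory tree and the whole object is re-serialized. A real implementation would presumably update the stored bytes directly, which is the capability that plain BLOBs lack.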
Markus Schneider is an Associate Professor at the Department of Computer and Information Science and Engineering of the University of Florida in Gainesville, Florida, USA. He holds an M.S. degree in Computer Science from the Technical University in Dortmund, Germany, and a Ph.D. degree in Computer Science from the University of Hagen, Germany. His research interests include spatial, spatio-temporal, and moving objects databases, spatial data warehousing and SOLAP, spatial information science, geoinformatics, geographical information systems, applied computational geometry, and extensible databases. He is the co-author of the book Moving Objects Databases published by Morgan Kaufmann, the author of the book Spatial Data Types for Database Systems published by Springer-Verlag, and the author of the book Implementation Concepts for Database Systems published by Springer-Verlag. Further, he has published more than 100 journal articles, book chapters, and conference papers. He is on the editorial board of the journal GeoInformatica and a recipient of the 2004 National Science Foundation (NSF) CAREER Award.