The Division of Information Technology

Big Data

Big data can be defined as both an object and an action.

As an object, big data is data of any type that is produced in large quantities and at high rates, and whose accuracy or truthfulness may vary. These characteristics are referred to as the "4 V's" of big data: variety, volume, velocity, and veracity.

As an action or process, big data entails collecting these rapidly generated, massive and varied data sets so that they can be analyzed to discover insights.

The variety, volume, velocity, and veracity of big data have presented new issues for its management, especially with respect to data storage and data processing. The massive amount of data now being created requires new ways to store it, and the resulting storage solutions can hold far more data than traditional enterprise storage systems such as data warehouses. Whereas data warehouses typically hold data on the order of terabytes, new hyperscale storage solutions can hold petabytes of data and more. On the processing side, new software frameworks have been developed to help process the varied data. Examples of these storage solutions and software frameworks include NoSQL database systems like Cassandra, and processing frameworks like Hadoop®.
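To make the processing side more concrete, here is a minimal sketch of the map, shuffle, and reduce pattern that Hadoop's MapReduce model is built around. It runs on a single machine in plain Python and does not use Hadoop's actual API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Turn one input record into a list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run the three phases in sequence over an in-memory list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input; a real job would read records from distributed storage.
records = ["big data big insights", "data at scale"]
counts = word_count(records)
print(counts)
```

In a real cluster, the map and reduce steps would run in parallel on many machines, each working on its own slice of the data, which is what lets this style of processing scale to very large data sets.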