Lecture seven - Data Structures
Data Structure is a way of collecting and organising data in such a way that we can perform operations on these data in an effective way.
A structure is a collection of variable types grouped together.
General data structure types include the array, the file, the record, the table, etc.
In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.
The data in the data structures are processed by certain operations.
The particular data structure chosen largely depends on the frequency of the operation that needs to be performed on the data structure.
Traversing
Searching
Insertion
Deletion
Sorting
Merging
In our languages operation will be functions that will be executed on data structures.
Anything that can store data can be called as a data structure, hence Integer, Float, Boolean, Char etc, all are data structures.
They are known as Primitive Data Structures (or primitives).
Complex Data Structures like lists, sets, graphs, trees etc. are called Abstract Data Structure (or Derived data types).
C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef.
Definition:
typedef type alias_for_the_type;
Example
typedef unsigned long ulong;
typedef int (*funcptr)(char, unsigned long);
funcptr f; // f is a ptr to a function taking char & unsigned long, returning an int
Every identifier introduced in this declaration becomes a typedef-name, which is a synonym for the type of the object or function that it would become if the keyword typedef were removed.
Structure is a collection of variables of different types under a single name.
Keyword struct is used for creating a structure.
struct structure_name
{
data_type member1;
data_type member2;
.
.
data_type memeber;
};
Example:
struct person
{
char name[50];
int citNo;
float salary;
};
This declaration above creates the data type struct person. Note the semicolon at the end of declaration!
Initializing value in the struct:
struct A {
int i;
double j;
};
A b = {}; // b.i is 0 and b.j is 0.0
A c = {1}; // c.i is 1 and c.j is 0.0
A d = {1, 2.0};
Or
struct A {
int i = 0;
double j = 0.0;
};
Declaration of variable of type struct:
struct person
{
char name[50];
int citNo;
float salary;
};
int main()
{
person person1, person2, person3[20];
return 0;
}
or
struct person
{
char name[50];
int citNo;
float salary;
} person1, person2, person3[20];
There are two types of operators used for accessing members of a structure.
Member operator(.)
Structure pointer operator(->)
Example
person2.salary
person2->salary
Structures can be created and accessed using pointers. A pointer variable of a structure can be created as below:
struct name {
member1;
member2;
.
.
};
int main()
{
name *ptr;
}
In this example, the pointer variable of type struct person is referenced to the address of person1.
struct person
{
int age;
float weight;
};
int main()
{
struct person *personPtr, person1;
personPtr = &person1; // Referencing pointer to memory address of person1
printf("Enter integer: ");
scanf("%d",&(*personPtr).age);
printf("Displaying: ");
printf("%d",(*personPtr).age);
}
You cas use (->) operator to access dereferenced object in a memory.
struct person
{
int age;
float weight;
};
int main()
{
struct person *personPtr, person1;
personPtr = &person1; // Referencing pointer to memory address of person1
printf("Enter integer: ");
scanf("%d",&personPtr->age);
printf("Displaying: ");
printf("%d",personPtr->age);
}
Structures can be nested within other structures in C programming.
Example:
struct complex
{
int imag_value;
float real_value;
};
struct number
{
struct complex comp;
int real;
} num1, num2;
The Dot (.) operator can't be applied to pointers.
*person.age() wouldn't work because Dot (.) operator is evaluated first.
person->age() is the same as (*person).age()
Struct types can be passed by value and reference to the function as any other types.
Note that Pair or Tuple from previous lectures are examples of structures.
Union is a very similar type to struct. Defining a union is as easy as replacing the keyword struct with the keyword union.
union car
{
char name[50];
int price;
};
Every other element (access, pointers, references) is the same as in the case of the struct.
union unionJob
{
//defining a union
char name[32];
float salary;
int workerNo;
} uJob;
struct structJob
{
char name[32];
float salary;
int workerNo;
} sJob;
int main()
{
printf("size of union = %d", sizeof(uJob));
printf("\nsize of structure = %d", sizeof(sJob));
return 0;
}
Output:
size of union = 32
size of structure = 40
The amount of memory required to store a structure variable is the sum of memory size of all members.
The memory required to store a union variable is the memory required for the largest element of an union.
In the case of structure, all of its members can be accessed at any time.
In the case of union, only one of its members can be accessed at a time and all other members will contain garbage values if accessed.
We use Union in the case when you want to save memory space, for example in network protocols.
The Standard Template Library (STL) is a set of C++ template classes to provide common programming data structures and functions such as lists, stacks, arrays, etc.
Available only in C++ (uses classes and templates).
STL helps make code simpler as well as more powerful.
STL uses dynamic objects.
To work with STL put the
using namespace std;
before the main function main() the code.
The STL consists of three main components:
Container: an object that is able to keep and administer objects. Several containers are available - list, set, etc - but as far as possible they have the same member function names. For example, size(), empty(), begin(), end() and swap usually work whatever the container
Algorithm: a computational procedure that is able to work on different containers
Iterator: abstraction of "pointer" or "index" - a means to access the appropriate element.
A container can be treated as a single entity, but you sometimes need a way to access individual items.
Iterators are generalised pointers to work with containers.
For a container "v", v.begin() returns an iterator pointing to the first element and v.end() "points" to one beyond the last.
Example (for a vector as a container).
for (vector<int>::iterator it = v.begin(); it != v.end(); it++) {
cout << *it << " "; cout << endl;
}
In C++11:
for (auto it = v.begin(); it != v.end(); it++) {
cout << *it << " ";
}
Containers or container classes store objects and data.
Every object has a specific position (index).
Defined in header file <vector>.
Vectors are sequence containers representing dynamically allocated arrays.
Vectors are accessed as arrays.
Insertion of data is to the end of the vector.
Definition:
vector<int> a;
vector<float> b(10); //vector of given size
vector< vector<int> > matrixVector;
Compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way.
Vectors are very efficient accessing its elements (just like arrays) and relatively efficient adding or removing elements from its end.
For operations that involve inserting or removing elements at positions other than the end, they perform worse than the others.
Example:
vector<int> myvector;
// set some values (from 1 to 11)
for (int i=1; i<=11; i++) myvector.push_back(i);
myvector.pop_back();
// erase the 6th element
myvector.erase (myvector.begin()+5);
// erase the first 3 elements:
myvector.erase (myvector.begin(),myvector.begin()+3);
it = myvector.begin();
myvector.insert ( it , 200 );
for (unsigned i=0; i<myvector.size(); ++i)
cout << ' ' << myvector[i];
Output:
200 4 5 7 8 9 10
Defined in header file <list>.
Lists are sequence containers that allow constant time insert and erase operations anywhere within the sequence, and iteration in both directions.
List containers are implemented as doubly-linked lists.
Definition:
list<int> l = { 7, 5, 16, 8 };
list<int> l(20); // list of a given size
Compared to other base standard sequence containers (array, vector and deque), lists perform generally better in inserting, extracting and moving elements in any position within the container, also for sorting.
The main drawback of lists and forward_lists compared to these other sequence containers is that they lack direct access to the elements by their position;
For example, to access the sixth element in a list, one has to iterate from a known position (like the beginning or the end) to that position.
Example:
int main()
{
list<int> l{1,2,3,4,5};
l.pop_back();
l.push_back(7);
/* now the list becomes 1,2,3,4,7 */
l.pop_front();
l.push_front(9);
/* now the list becomes 9,2,3,4,7 */
l.reverse();
/* now the list becomes 7,4,3,2,9 */
}
Queue is defined in header file <queue>.
Insertion always takes place at the back of the queue and deletion is always performed at the front of the queue.
Queue is defined in header file <stack>.
Insertion and deletion is always performed at the top of the stack.
Definition:
stack<int> s;
queue<int> q;
Example:
int main ()
{
queue<int> myqueue;
myqueue.push(77);
myqueue.push(16);
myqueue.front() -= 6;
cout << myqueue.front() << '\n';
stack<int> mystack;
mystack.push(10);
mystack.top() -= 5;
mystack.push(20);
mystack.pop();
cout << mystack.top()
return 0;
}
Output:
10
15
Priority_queue is just like a normal queue except the element removed from the queue is always the greatest among all the elements in the queue (like in a heap).
Elements can be inserted at any order and it have O(log(n)) time complexity for insertion.
Example:
priority_queue<int> pq1;
pq1.push(30); // inserts 30 to pq1 , now top = 30
pq1.push(40); // inserts 40 to pq1 , now top = 40 ( maxinmum element)
pq1.push(90); // inserts 90 to pq1 , now top = 90
pq1.push(60); // inserts 60 to pq1 , top still is 90
pq1.pop(); // removes 90 ( greatest element in the queue
Offers O(logn) insertion, deletion and access.
Stores only ordered types (with defined comparator).
They may use hashing to store data.
Defined in header file <set>.
Sets are containers that store unique elements following a specific order.
The value of the elements in a set cannot be modified once in the container (the elements are always const), but they can be inserted or removed from the container.
Sets are usually implemented as red-black trees (self-balancing trees).
Definition:
set<int> a;
int myints[]= {10,20,30,40,50};
set<int> second (myints,myints+5);
Operations are done in O(log n) time.
Example:
set<int> myset;
set<int>::iterator it;
// set some initial values:
for (int i=1; i<=5; i++) myset.insert(i*10); // set: 10 20 30 40 50
it=myset.find(20);
myset.erase(it);
myset.erase(myset.find(40));
for (it=myset.begin(); it!=myset.end(); ++it)
cout << ' ' << *it;
Output:
10 30 50
Defined in header file <multiset>.
Multiset is an associative container that contains a sorted set of objects of type Key. Unlike set, multiple keys with equivalent values are allowed.
Multiset uses the same functions as the set.
Multisets are typically implemented as binary search trees.
Defined in header file <map>.
Maps contain sorted key-value pair, in which each key is unique and cannot be changed, and it can be inserted or deleted but cannot be altered.
Value associated with keys can be altered.
Maps are typically implemented as balanced binary search trees.
Definition:
map<key_type , value_type> map_name;
map<int,int> m{ {1,2} , {2,3} , {3,4} };
map<string,int> map1;
Key of a map and corresponding values are always inserted as a pair, you cannot insert only key or just a value in a map.
The mapped values in a map can be accessed directly by their corresponding key using the bracket operator (operator[]).
Unordered map is an associative container that contains key-value pairs with unique keys. Search, insertion, and removal of elements have average constant-time complexity.
Example:
int main ()
{
unordered_map<int, string> dict = {{1, "one"}, {2, "two"}};
dict.insert({3, "three"});
dict.insert(make_pair(4, "four"));
dict.insert({{4, "another four"}, {5, "five"}});
dict[1] = "none";
dict[6] = "six";
cout << dict[1] << " " << dict.find(6)->first << " " << dict.find(6)->second << " " << dict[7] << "\n";
}
Output:
none 6 six
Algorithms in the STL are procedures that are applied to the containers to process their data.
Sorts the elements in the range [first,last) into ascending order.
Equivalent elements are not guaranteed to keep their original relative order.
int arr[5] = {1,5,8,4,2};
sort(arr , arr+5);
Sort algorithm is required to be using approximately N log N comparisons (different implementations possible).
lower_bound() - Returns an iterator pointing to the first element in the range [first,last) which does not compare less than val.
upper_bound() - Returns an iterator pointing to the first element in the range [first,last) which compares greater than val.
binary_search() - Returns true if any element in the range [first,last) is equivalent to val, and false otherwise.
int main () {
int myints[] = {10,20,30,30,20,10,10,20};
vector<int> v(myints,myints+8); // 10 20 30 30 20 10 10 20
sort (v.begin(), v.end()); // 10 10 10 20 20 20 30 30
vector<int>::iterator low,up;
low=lower_bound (v.begin(), v.end(), 20); // ^
up= upper_bound (v.begin(), v.end(), 20); // ^
cout << "lower_bound at position " << (low- v.begin()) << '\n';
cout << "upper_bound at position " << (up - v.begin()) << '\n';
return 0;
}
Output:
lower_bound at position 3
upper_bound at position 6
Uses binary search concept which result in approxmatelly log2(N)+1 element comparisons.
Partition - Rearranges the elements from the range [first,last), in such a way that all the elements for which pred returns true precede all those for which it returns false. The iterator returned points to the first element of the second group.
Example:
int main () {
vector<int> myvector;
// set some values:
for (int i=1; i<10; ++i) myvector.push_back(i); // 1 2 3 4 5 6 7 8 9
vector<int>::iterator bound;
bound = partition (myvector.begin(), myvector.end(), IsOdd);
// print out content:
cout << "odd elements:";
for (vector<int>::iterator it=myvector.begin(); it!=bound; ++it)
cout << ' ' << *it;
cout << '\n';
cout << "even elements:";
for (vector<int>::iterator it=bound; it!=myvector.end(); ++it)
cout << ' ' << *it;
cout << '\n';
return 0;
}
Error messages involving templates tend to be very long and difficult to decipher.
The concept of iterators as implemented by STL can be difficult to understand at first: for example, if a value pointed to by the iterator is deleted, the iterator itself is then no longer valid.
The set of algorithms is not complete.
Time Complexity is a way to represent the amount of time needed by the program to run till its completion as a function of input size.
Characterstic | Description |
Linear | In Linear data structures,the data items are arranged in a linear sequence. Example: Array. |
Non-Linear | In Non-Linear data structures,the data items are not in sequence. Example: Tree, Graph. |
Homogeneous | In homogeneous data structures,all the elements are of same type. Example: Array. |
Non-Homogeneous | In Non-Homogeneous data structure, the elements may or may not be of the same type. Example: Struct. |
Static | Static data structures are those whose sizes and structures associated memory locations are fixed, at compile time. Example: Array. |
Dynamic | Dynamic structures are those which expands or shrinks depending upon the program need and its execution. Also, their associated memory locations changes. Example: Linked List created using pointers. |
http://www.studytonight.com/data-structures/introduction-to-data-structures
https://www.programiz.com/c-programming/c-structures
https://blog.petrzemek.net/2015/10/03/structures-and-member-initializers-in-cpp/
www-h.eng.cam.ac.uk/help/tpl/talks/C++.html
http://slideplayer.com/slide/8227815/