Foundations of programming

Lecture seven - Data Structures

Work Plan

  1. Introduction, types of programming languages, pseudo-code
  2. Variables, data types
  3. Pointers and arrays
  4. Simple instructions and tricks
  5. Functions and procedures
  6. Debug
  7. Input / Output
  8. Data structures
  9. Operators, polymorphism, templates, namespace, C++14
  10. Python
  11. Threads, parallel programming
  12. Functional / object oriented programming
  13. Software development process, git, agile, tdd
  14. Final Test during last classes: about 30 open/close questions - mostly closed.

Structures

Data Structure is a way of collecting and organising data in such a way that we can perform operations on these data in an effective way.

image

Structure

A structure is a collection of variable types grouped together.

General data structure types include the array, the file, the record, the table, etc.

In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.

Structure

The data in the data structures are processed by certain operations.

The particular data structure chosen largely depends on the frequency of the operation that needs to be performed on the data structure.

Traversing Searching Insertion Deletion Sorting Merging

In our languages operation will be functions that will be executed on data structures.

Data

Anything that can store data can be called as a data structure, hence Integer, Float, Boolean, Char etc, all are data structures.

They are known as Primitive Data Structures (or primitives).

Complex Data Structures like lists, sets, graphs, trees etc. are called Abstract Data Structure (or Derived data types).

Typedef

C++ allows the definition of our own types based on other existing data types. We can do this using the keyword typedef.

Definition:

typedef typ_do_skrocenia nowy_alias_do_typu;

Example

typedef unsigned long ulong;

typedef int (*funcptr)(char, unsigned long);
funcptr f;  // f is a ptr to a function taking char & unsigned long, returning an int

Every identifier introduced in this declaration becomes a typedef-name, which is a synonym for the type of the object or function that it would become if the keyword typedef were removed.

Struct

Structure is a collection of variables of different types under a single name.

Keyword struct is used for creating a structure.

struct structure_name 
{
    data_type member1;
    data_type member2;
    .
    .
    data_type memeber;
};

Example:

struct person
{
    char name[50];
    int citNo;
    float salary;
};

This declaration above creates the data type struct person. Note the semicolon at the end of declaration!

Initialization

Initializing value in the struct:

struct A {
    int i;
    double j;
};

A b = {};       // b.i is 0 and b.j is 0.0 
A c = {1};      // c.i is 1 and c.j is 0.0 
A d = {1, 2.0};

Or

struct A {
    int i = 0;
    double j = 0.0;
};

Struct

Declaration of variable of type struct:

struct person
{
    char name[50];
    int citNo;
    float salary;
};

int main()
{
    person person1, person2, person3[20];
    return 0;
}

or

struct person
{
    char name[50];
    int citNo;
    float salary;
} person1, person2, person3[20];

Access struct

There are two types of operators used for accessing members of a structure.

Member operator(.) Structure pointer operator(->)

Example

person2.salary

person2->salary

Pointer to Structure

Structures can be created and accessed using pointers. A pointer variable of a structure can be created as below:

struct name {
    member1;
    member2;
    .
    .
};

int main()
{
    name *ptr;
}

Using the pointers

In this example, the pointer variable of type struct person is referenced to the address of person1. Then, only the structure member through pointer can be accessed.

struct person
{   
    int age;
    float weight;
};  

int main()
{
    struct person *personPtr, person1;
    personPtr = &person1;            // Referencing pointer to memory address of person1

    printf("Enter integer: ");
    scanf("%d",&(*personPtr).age);

    printf("Displaying: ");
    printf("%d",(*personPtr).age);
}

Using the pointers

You cas use (->) operator to access dereferenced object in a memory.

struct person
{   
    int age;
    float weight;
};  

int main()
{
    struct person *personPtr, person1;
    personPtr = &person1;            // Referencing pointer to memory address of person1

    printf("Enter integer: ");
    scanf("%d",&personPtr->age);

    printf("Displaying: ");
    printf("%d",personPtr->age);
}

Recurrence

Structures can be nested within other structures in C programming.

Example:

struct complex
{
    int imag_value;
    float real_value;
};

struct number
{
    struct complex comp;
    int real;
} num1, num2;

Summary

The Dot (.) operator can't be applied to pointers.

*person.age() wouldn't work because Dot (.) operator is evaluated first.

person->age() is the same as (person).age()*

Struct types can be passed by value and reference to the function as any other types.

Note that Pair or Tuple from previous lectures are examples of structures.

Union

Union is a very similar type to struct. Defining a union is as easy as replacing the keyword struct with the keyword union.

union car
{
    char name[50];
    int price;
};

Every other element (access, pointers, references) is the same as in the case of the struct.

Struct vs Union

union unionJob
{
    //defining a union
    char name[32];
    float salary;
    int workerNo;
} uJob;

struct structJob
{
    char name[32];
    float salary;
    int workerNo;
} sJob;

int main()
{
    printf("size of union = %d", sizeof(uJob));
    printf("\nsize of structure = %d", sizeof(sJob));
    return 0;
}

Output:

size of union = 32
size of structure = 40

Struct vs Union

The amount of memory required to store a structure variable is the sum of memory size of all members.

The memory required to store a union variable is the memory required for the largest element of an union.

In the case of structure, all of its members can be accessed at any time.

In the case of union, only one of its members can be accessed at a time and all other members will contain garbage values if accessed.

We use Union in the case when you want to save memory space, for example in network protocols.

Standard Template Library

image

Standard Template Library

The Standard Template Library (STL) is a set of C++ template classes to provide common programming data structures and functions such as lists, stacks, arrays, etc.

Available only in C++ (uses classes and templates).

STL helps make code simpler as well as more powerful.

STL uses dynamic objects.

To work with STL put the

using namespace std;

before the main function main() the code.

Standard Template Library

The STL consists of three main components:

Container: an object that is able to keep and administer objects. Several containers are available - list, set, etc - but as far as possible they have the same member function names. For example, size(), empty(), begin(), end() and swap usually work whatever the container Algorithm: a computational procedure that is able to work on different containers Iterator: abstraction of "pointer" or "index" - a means to access the appropriate element.

Iterators

A container can be treated as a single entity, but you sometimes need a way to access individual items.

Iterators are generalised pointers to work with containers.

For a container "v", v.begin() returns an iterator pointing to the first element and v.end() "points" to one beyond the last.

Example (for a vector as a container).

for (vector<int>::iterator it = v.begin(); it != v.end(); it++) {
    cout << *it << " "; cout << endl; 
}

In C++11:

for (auto it = v.begin(); it != v.end(); it++) {
    cout << *it << " ";
}

Containers

Containers or container classes store objects and data.

Containers defines:
a state of an object operations that act on the object, possibly changing the state (like add, delete, insert etc.)
Two types of containers in STL:
sequence associative

Common functions

Operations common for all containers.
empty() - return true if container is empty size() - return number of elements in the container x.swap(y) - swap to elements of the container clear() - delete all elements from the container
Operators:
a = b - copy all elements of b into a a == b - return true if containers a and b are equal

Sequence containers

Every object has a specific position (index).

Examples:
vector queue list

Vector

Defined in header file <vector>.

Vectors are sequence containers representing dynamically allocated arrays.

Vectors are accessed as arrays.

Insertion of data is to the end of the vector.

Definition:

vector<int> a;

vector<float> b(10); //vector of given size

vector< vector<int> > matrixVector;

Compared to arrays, vectors consume more memory in exchange for the ability to manage storage and grow dynamically in an efficient way.

Vector

image

Vector

Operations:
push_back( el ) - Adds a new element el at the end of the vector, after its current last element. pop_back() - Removes the last element in the vector, effectively reducing the container size by one. insert( el ) - inserting new elements before the element at the specified position (unefficient). erase(..) - Removes from the vector either a single element (position) or a range of elements ([first,last)) - unefficient.

Vectors are very efficient accessing its elements (just like arrays) and relatively efficient adding or removing elements from its end.

For operations that involve inserting or removing elements at positions other than the end, they perform worse than the others.

Vector

Example:

vector<int> myvector;

// set some values (from 1 to 11)
for (int i=1; i<=11; i++) myvector.push_back(i);

myvector.pop_back();

// erase the 6th element
myvector.erase (myvector.begin()+5);

// erase the first 3 elements:
myvector.erase (myvector.begin(),myvector.begin()+3);

it = myvector.begin();
myvector.insert ( it , 200 );

for (unsigned i=0; i<myvector.size(); ++i)
cout << ' ' << myvector[i];

Output:

200 4 5 7 8 9 10

List

Defined in header file <list>.

Lists are sequence containers that allow constant time insert and erase operations anywhere within the sequence, and iteration in both directions.

List containers are implemented as doubly-linked lists.

Definition:

list<int> l = { 7, 5, 16, 8 };

list<int> l(20); // list of a given size

List

image

List

Operations:
push_back( el ) - Adds a new element el at the end of the list, after its current last element. pop_back() - Removes the last element in the list, effectively reducing the container size by one. push_front( el ) - Inserts a new element at the beginning of the list, right before its current first element. pop_front() - Removes the first element in the list container, effectively reducing its size by one. insert(el ) - Inserting new elements before the element at the specified position (efficient). erase(..) - Removes from the vector either a single element (position) or a range of elements ([first,last)) - efficient. reverse() - Reverses the order of the elements in the list container.

Compared to other base standard sequence containers (array, vector and deque), lists perform generally better in inserting, extracting and moving elements in any position within the container, also for sorting.

The main drawback of lists and forward_lists compared to these other sequence containers is that they lack direct access to the elements by their position;

For example, to access the sixth element in a list, one has to iterate from a known position (like the beginning or the end) to that position.

List

Example:

int main()
{
    list<int> l{1,2,3,4,5};

    l.pop_back();
    l.push_back(7);
    /* now the list becomes 1,2,3,4,7 */

    l.pop_front();
    l.push_front(9);
    /* now the list becomes 9,2,3,4,7 */

    l.reverse();
    /* now the list becomes 7,4,3,2,9 */
}

Queue and Stack

Queue is defined in header file <queue>.

Insertion always takes place at the back of the queue and deletion is always performed at the front of the queue.

Queue is defined in header file <stack>.

Insertion and deletion is always performed at the top of the stack.

Definition:

stack<int> s;

queue<int> q;

Queue and Stack

image

Queue and Stack

Operations:
push( el ) - Adds a new element el at the top of the stack / queue. pop() - Removes the top element from the stack / queue. top() - Returns the topmost element of the stack. front() - Returns the front element of the queue back() - Returns the element at the back of the queue

Queue and Stack

Example:

int main ()
{
    queue<int> myqueue;

    myqueue.push(77);
    myqueue.push(16);

    myqueue.front() -= 6;   

    cout << myqueue.front() << '\n';

    stack<int> mystack;

    mystack.push(10);
    mystack.top() -= 5;
    mystack.push(20);
    mystack.pop();

    cout << mystack.top() 

    return 0;
}

Output:

10
15

Priority queue

Priority_queue is just like a normal queue except the element removed from the queue is always the greatest among all the elements in the queue (like in a heap).

Elements can be inserted at any order and it have O(log(n)) time complexity for insertion.

Example:

priority_queue<int> pq1;

pq1.push(30);  // inserts 30 to pq1 , now top = 30
pq1.push(40);  // inserts 40 to pq1 , now top = 40 ( maxinmum element)
pq1.push(90);  // inserts 90 to pq1 , now top = 90  
pq1.push(60);   // inserts 60 to pq1 , top still is 90  

pq1.pop();  // removes 90 ( greatest element in the queue 

Associative containers

Offers O(logn) insertion, deletion and access.

Stores only ordered types (with defined comparator).

They may use hashing to store data.

Set

Defined in header file <set>.

Sets are containers that store unique elements following a specific order.

The value of the elements in a set cannot be modified once in the container (the elements are always const), but they can be inserted or removed from the container.

Sets are usually implemented as red-black trees (self-balancing trees).

Definition:

set<int> a;

int myints[]= {10,20,30,40,50};
set<int> second (myints,myints+5); 

Set

image

Set

Operations:
count( el ) - Searches the container for elements equivalent to val and returns the number of matches. find( el ) - Searches the container for an element equivalent to val and returns an iterator to it if found, otherwise it returns an iterator to set::end. insert( el ) - Extends the container by inserting new elements (if element is not already in the set). erase(..) - Removes from the vector either a single element (position) or a range of elements ([first,last)).

Operations are done in O(log n) time.

Set

Example:

set<int> myset;
set<int>::iterator it;

// set some initial values:
for (int i=1; i<=5; i++) myset.insert(i*10);    // set: 10 20 30 40 50

it=myset.find(20);
myset.erase(it);
myset.erase(myset.find(40));

for (it=myset.begin(); it!=myset.end(); ++it)
cout << ' ' << *it;

Output:

10 30 50

Multiset

Defined in header file <multiset>.

Multiset is an associative container that contains a sorted set of objects of type Key. Unlike set, multiple keys with equivalent values are allowed.

Multiset uses the same functions as the set.

Multisets are typically implemented as binary search trees.

Map

Defined in header file <map>.

Maps contain sorted key-value pair, in which each key is unique and cannot be changed, and it can be inserted or deleted but cannot be altered.

Value associated with keys can be altered.

Maps are typically implemented as balanced binary search trees.

Definition:

map<key_type , value_type> map_name;

map<int,int> m{ {1,2} , {2,3} , {3,4} };

map<string,int> map1;

Key of a map and corresponding values are always inserted as a pair, you cannot insert only key or just a value in a map.

Map

image

HashMap

image

Map

Operations:
[ key ] - Assign, change and access value for given key. insert( el ) - Insert entries in the map, if the key is already present in the map insertion is aborted. erase( it ) - Removes the entry from the map pointed by the iterator. find( el ) - Searches the container for an element with a key equivalent to k and returns an iterator to it if found, otherwise it returns an iterator to map::end.

The mapped values in a map can be accessed directly by their corresponding key using the bracket operator (operator[]).

Unordered map is an associative container that contains key-value pairs with unique keys. Search, insertion, and removal of elements have average constant-time complexity.

Hash_map (unorder_map)

Example:

int main ()
{
    unordered_map<int, string> dict = {{1, "one"}, {2, "two"}};
    dict.insert({3, "three"});
    dict.insert(make_pair(4, "four"));
    dict.insert({{4, "another four"}, {5, "five"}});

    dict[1] = "none";
    dict[6] = "six";

    cout << dict[1] << " " << dict.find(6)->first << " " << dict.find(6)->second << " " << dict[7] << "\n";
}

Output:

none 6 six

Algorithms

Algorithms in the STL are procedures that are applied to the containers to process their data.

Examples:
Sorting Searching Partition

Sort

Sorts the elements in the range [first,last) into ascending order.

Equivalent elements are not guaranteed to keep their original relative order.

This function of the STL, sorts the contents of the given range. There are two version of sort() :
sort(start_iterator, end_iterator ) - sorts the range defined by iterators start_iterator and end_iterator in ascending order. sort(start_iterator, end_iterator, compare_function) - this also sorts the given range but you can define how the sorting should be done by compare_function.
int arr[5] = {1,5,8,4,2};

sort(arr , arr+5); 

Sort algorithm is required to be using approximately N log N comparisons (different implementations possible).

Find

lower_bound() - Returns an iterator pointing to the first element in the range [first,last) which does not compare less than val.

upper_bound() - Returns an iterator pointing to the first element in the range [first,last) which compares greater than val.

binary_search() - Returns true if any element in the range [first,last) is equivalent to val, and false otherwise.

int main () {
    int myints[] = {10,20,30,30,20,10,10,20};
    vector<int> v(myints,myints+8);           // 10 20 30 30 20 10 10 20

    sort (v.begin(), v.end());                // 10 10 10 20 20 20 30 30

    vector<int>::iterator low,up;
    low=lower_bound (v.begin(), v.end(), 20); //          ^
    up= upper_bound (v.begin(), v.end(), 20); //                   ^

    cout << "lower_bound at position " << (low- v.begin()) << '\n';
    cout << "upper_bound at position " << (up - v.begin()) << '\n';

    return 0;
}

Output:

lower_bound at position 3
upper_bound at position 6

Uses binary search concept which result in approxmatelly log2(N)+1 element comparisons.

Partition

Partition - Rearranges the elements from the range [first,last), in such a way that all the elements for which pred returns true precede all those for which it returns false. The iterator returned points to the first element of the second group.

Example:

int main () {
    vector<int> myvector;

    // set some values:
    for (int i=1; i<10; ++i) myvector.push_back(i); // 1 2 3 4 5 6 7 8 9

    vector<int>::iterator bound;
    bound = partition (myvector.begin(), myvector.end(), IsOdd);

    // print out content:
    cout << "odd elements:";
    for (vector<int>::iterator it=myvector.begin(); it!=bound; ++it)
    cout << ' ' << *it;
    cout << '\n';

    cout << "even elements:";
    for (vector<int>::iterator it=bound; it!=myvector.end(); ++it)
    cout << ' ' << *it;
    cout << '\n';

    return 0;
}

image

Downsides of using STL

Error messages involving templates tend to be very long and difficult to decipher.

The concept of iterators as implemented by STL can be difficult to understand at first: for example, if a value pointed to by the iterator is deleted, the iterator itself is then no longer valid.

The set of algorithms is not complete.

Summary

Time Complexity is a way to represent the amount of time needed by the program to run till its completion as a function of input size.

image

Summary

Characterstic Description
Linear In Linear data structures,the data items are arranged in a linear sequence. Example: Array.
Non-Linear In Non-Linear data structures,the data items are not in sequence. Example: Tree, Graph.
Homogeneous In homogeneous data structures,all the elements are of same type. Example: Array.
Non-Homogeneous In Non-Homogeneous data structure, the elements may or may not be of the same type. Example: Struct.
Static Static data structures are those whose sizes and structures associated memory locations are fixed, at compile time. Example: Array.
Dynamic Dynamic structures are those which expands or shrinks depending upon the program need and its execution. Also, their associated memory locations changes. Example: Linked List created using pointers.

References

http://www.studytonight.com/data-structures/introduction-to-data-structures

https://www.programiz.com/c-programming/c-structures

https://blog.petrzemek.net/2015/10/03/structures-and-member-initializers-in-cpp/

www-h.eng.cam.ac.uk/help/tpl/talks/C++.html

http://slideplayer.com/slide/8227815/

https://www.slideshare.net/SreejithSree1/stl-25264696

http://www.cplusplus.com/

http://www.studytonight.com/cpp/stl