Analysis of python data structure algorithms

Previously on Learning:

python datatypes.python data structures: data types.
Python input and output.python data structure input and output and control and exception.
python object-oriented.python data structures object oriented.

Today we come to learn the content of the interview questions are to avoid not a small problem, is the algorithm analysis, what is the algorithm analysis, algorithm analysis is used to analyze the good and bad of an algorithm, everyone to complete a thing to write different algorithms, you need to algorithm analysis to judge the algorithm of the good and bad, the most common is the complexity of the procedure of O(n).

1. Definition of algorithmic analysis

There's this question:Are there advantages and disadvantages when two seemingly different programs solve the same problem? The answer is yes. Algorithm analysis is concerned with comparing algorithms based on the computational resources used. We say that Algorithm A is better than Algorithm B based on the fact that Algorithm A has higher resource utilization or uses fewer resources. From this point of view, the two functions above are actually similar in that they both essentially utilize the same algorithm to solve the accumulation problem.

What exactly does computing resource mean? It is important to think about this question. There are two ways to think about it.

One is to consider the amount of space or memory that the algorithm is going to take up in solving the problem. The total amount of space required by the solution is generally determined by the problem instance itself, but algorithms often have specific space requirements as well.
The second is to analyze and compare algorithms based on the time they take to execute. This metric is sometimes called the execution time or runtime of the algorithm. To measuresumOfN One way to get the execution time of a function is to do a benchmark analysis. That is, we keep track of the actual time it takes for the program to compute the result. In Python, we record the start time and the end time of a function with respect to the system it's on.time The module has a time function that returns the current system clock time in seconds from a specified point in time. This function is called once at the beginning and once at the end, and the difference is calculated to get the execution time in seconds.

An example:We need to solve for the sum of the first n numbers, and evaluate the efficiency by calculating the time required. (Here we use the time function, and calculate it 5 times to see how much time it takes roughly)

First approach: circular programs

import time
def sumOfN2(n): 
        start=()
        thesum=0
        for i in range(1,n+1):
            thesum=thesum+i
        end=()
        return thesum,end-start
#Cycle 5 times
for i in range(5):
     print("Sum is %d required %10.7f seconds" % sumOfN2(10000))

The results are as follows:

Second method: the formulaic approach

# Direct use of the summation formula
def sumOfN3(n): 
        start=()
        thesum=(1+n)*n/2
        end=()
        return thesum,end-start
for i in range(5):
     print("Sum is %d required %10.7f seconds" % sumOfN3(10000))

The results are as follows:

Intuitively, the circular scheme looks like more work because some steps are repeated. This seems to be the reason why it takes longer. Also, the time taken for the cyclic scheme grows with n. However, there is a problem here. If you run this function on another computer, or implement it in another programming language, you will probably get different results. If the computer were older, thesumOfN3 The implementation of the program takes even longer.
We need a better way to characterize the execution time of an algorithm. Benchmarking calculates the actual time it takes to execute an algorithm. This is not a useful metric because it depends on the particular computer, program, time, compiler and programming language. We would like to find a metric that is independent of the program or computer. Such a metric would be more useful in evaluating algorithms and could be used to compare algorithms from different implementations.

2. Big O notation

Here to give you an idea of the growth rate of some of the functions, I decided to list some of the function's.

Example: Calculate the number of steps in the following program, and the order of magnitude greater O

a = 5
b = 6
c = 10
for i in range(n): 
    for j in range(n): 
        x = i * i 
        y = j * j 
        z = i * j 
for k in range(n): 
    w = a * k + 45  
    v = b * b
d = 33

The number of assignments in this program is:

You can do the math yourself.

3. Big O notation for different algorithms

Here we use different algorithms to implement a classic heteroglossic word detection ask, the so-called heteroglossic words, that is, the letters that make up the word are the same, just in a different order, for exampleheartrespond in singingearth，pythonrespond in singingtyphon. To simplify the problem, we assume that the two word strings to be tested are already the same length.

3.1 Inventory method

This method essentially counts each character in the 1st string to see if they both appear in the 2nd string. If so, then the two strings must be disjuncts. Counting is accomplished by using thePython This is accomplished by replacing the character with the special value None in Python. However, because strings in Python are immutable, you first convert the second string into a list. You check the list of characters for each character in the first string, and if you find one, you replace it.

def anagramSolution1(s1, s2):
    alist = list(s2)
    pos1=0
    stillOK = True
    while pos1 < len(s1) and stillOK:
          pos2 = 0
          found = False
          while pos2 < len(alist) and not found:
                if s1[pos1] == alist[pos2]:
                   found = True
                else:
                   pos2 = pos2 + 1
          if found:
                alist[pos2] = None
          else:
                stillOK = False
          pos1 = pos1 + 1
    return stillOK

to analyze this algorithm. Note that for each of the n characters in s1, the n characters in s2 are traversed when each is examined. To match a character in s1, each of the n positions in the list is visited once. Therefore, the number of visits becomes a sum of integers from 1 to n. This can be expressed by the following equation. This can be expressed by the following formula.

Therefore, the time complexity of the method is

3.2 Sequencing

Although s1 and s2 are different strings, they are heterographs as long as they consist of the same characters. Based on this, another scheme can be used. If the characters are sorted in alphabetical order, the result of the disjunction will be the same string.

def anagramSolution2(s1, s2):
       alist1 = list(s1)
       alist2 = list(s2)
       ()
       ()
       pos=0
       matches = True
       while pos < len(s1) and matches:
             if alist1[pos] == alist2[pos]:
                pos = pos + 1
             else:
                matches = False
      return matches

At first glance, you might think that the time complexity of this algorithm is O ( n ) O(n) O(n) because you only need to traverse it once to compare n characters after sorting. However, calling the sort method twice is not without cost. As we will see later, the time complexity of sort is essentially O(n 2 ) O(n2 )O(n2 ) or O(nlog n ) O(nlogn)O(nlogn), so the sorting operation plays a dominant role. That is, the algorithm is of the same order of magnitude as the sorting process.

3.3 Brute force method

The brute force approach to the problem is basically to exhaust all possibilities. In the case of the heteroscedastic word detection problem, you can use the characters in s1 to generate all possible strings to see if s2 is among them. But there is a difficulty with this approach. When generating all possible strings using the characters in s1, there are n possibilities for the first character, n-1 possibilities for the second character, n-2 possibilities for the third character, and so on. The total number of strings is n ∗ ( n - 1 ) ∗ ( n - 2 ) ∗ . . . . . . ∗ 3 ∗ 2 ∗ 1 n∗(n-1)∗(n-2)* ...... ∗3∗2∗1n∗(n-1)∗(n-2)∗...... ∗3∗2∗1,which is n ! n!n!Perhaps some strings will be repeated, but the program cannot foresee this, so it will certainly generate n ! n!n!strings.
When n is large, n! grows faster than 2n. In fact, if s1 has 20 characters, then the number of strings is 20!= 2432902008176640000 . Assuming one per second, it would take 77146816596 years to process the entire list. This is not a good solution.

3.4 Counting method

This last scheme is based on the fact that two dissimilar words have the same number of a's, the same number of b's, the same number of c's, and so on. To determine whether two strings are dissimilar words, count the number of times each character occurs. Since there may be 26 types of characters, 26 counters are used, corresponding to each character. Each time a character is encountered, the corresponding counter is incremented by 1. Finally, if the two counter lists are the same, then the two strings are definitely out-of-order words.

def anagramSolution4(s1, s2):
       c1=[0]*26
       c2=[0]*26

       for i in range(len(s1)):
           pos = ord(s1[i]) - ord('a')
           c1[pos] = c1[pos] + 1

       for i in range(len(s2)):
          pos = ord(s2[i]) - ord('a')
          c2[pos] = c2[pos] + 1
       j=0
       stillOK = True
       while j < 26 and stillOK:
             if c1[j] == c2[j]:
                j=j+1
             else:
                stillOK = False
       return stillOK

This scenario also has loops. However, unlike scheme 1, the loops in this scheme are not nested. The first two counting loops are nth order. The third loop compares the two lists, and since there are 26 possible characters, it loops 26 times. Adding them all up gives a total number of steps T (n) = 2n - 26, which is O(n). We have found a linear order algorithm for solving the heteroscedastic word detection problem.

4. Complexity of list and dictionary operations

4.1 Lists

4.2 Dictionaries

This article on python data structure algorithm analysis is introduced to this article, more related python algorithm analysis content please search for my previous articles or continue to browse the following related articles I hope you will support me in the future!