Popcorn Hacks:

Predictive analysis is the use of statistical, data mining, and machine learning techniques to analyze current and historical data in order to make predictions about future events or behaviors. It involves identifying patterns and trends in data, and then using that information to forecast what is likely to happen in the future.

Predictive analysis is used in a wide range of applications, from forecasting sales and demand, to predicting customer behavior, to detecting fraudulent transactions. It involves collecting and analyzing data from a variety of sources, including historical data, customer data, financial data, and social media data, among others.

The process of predictive analysis typically involves the following steps:

  1. Defining the problem and identifying the relevant data sources
  2. Collecting and cleaning the data
  3. Exploring and analyzing the data to identify patterns and trends
  4. Selecting an appropriate model or algorithm to use for predictions
  5. Training and validating the model using historical data
  6. Using the model to make predictions on new data
  7. Monitoring and evaluating the performance of the model over time
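The steps above can be sketched end to end with NumPy alone. This is only a minimal illustration, not a recommended workflow: the monthly sales figures are made up, the variable names are hypothetical, and a straight-line fit with np.polyfit stands in for whatever model a real project would select.

```python
import numpy as np

# Steps 1-2: hypothetical, already-cleaned historical data
# (month index vs. units sold; the numbers are invented for illustration).
months = np.arange(1, 13)
sales = np.array([110, 118, 131, 139, 152, 160, 171, 178, 192, 201, 209, 222])

# Step 5: hold out the last 3 months for validation, train on the first 9.
train_x, test_x = months[:9], months[9:]
train_y, test_y = sales[:9], sales[9:]

# Step 4: choose a simple model, here a least-squares straight line.
slope, intercept = np.polyfit(train_x, train_y, deg=1)

# Step 5 (continued): validate with the mean absolute error on the held-out months.
predicted_test = slope * test_x + intercept
mae = np.mean(np.abs(predicted_test - test_y))

# Step 6: use the fitted line to forecast a month that has not happened yet.
forecast = slope * 13 + intercept

print("slope:", slope, "intercept:", intercept)
print("validation MAE:", mae)
print("forecast for month 13:", forecast)
```

Step 7 would mean repeating the validation as new months arrive and refitting when the error grows.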

Predictive analysis can help organizations make more informed decisions, improve efficiency, and gain a competitive advantage by leveraging insights from data.

Two of its most common application areas are retail and healthcare. In retail, businesses try to predict which products will be most popular and focus their advertising on those products. In healthcare, algorithms analyze patterns to reveal risk factors for diseases and suggest preventive treatment, predict the results of various treatments so that the best option can be chosen for each patient individually, and forecast disease outbreaks and epidemics.

The array is the central data structure of the NumPy library. Arrays are containers that can store more than one item at the same time. The function np.array creates an array from a (possibly nested) list, so it can also be used to create multidimensional arrays.

import numpy as np

# create 3D array here
x = np.array([[[1,1,2], [2,3,3]], [[1,1,1], [1,1,1]]])
print(x.shape)
print(x)
(2, 2, 3)
[[[1 1 2]
  [2 3 3]]

 [[1 1 1]
  [1 1 1]]]
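As a small sketch of how the three dimensions of the array above can be inspected and indexed (the variable names other than x are chosen here for illustration):

```python
import numpy as np

# Same 3D array as above: 2 blocks, each with 2 rows and 3 columns.
x = np.array([[[1, 1, 2], [2, 3, 3]], [[1, 1, 1], [1, 1, 1]]])

ndim = x.ndim          # number of dimensions
size = x.size          # total number of elements (2 * 2 * 3)
first_block = x[0]     # the first 2x3 block
element = x[0, 1, 2]   # block 0, row 1, column 2

print(ndim, size)
print(first_block)
print(element)
```

Indexing with fewer indices than dimensions, as in x[0], returns a smaller array rather than a single number.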

3. Basic array operations

Arithmetic is one of the most basic operations that can be performed on arrays. With NumPy, you can add, subtract, multiply, and divide arrays just as you would regular numbers. NumPy applies each operation element-wise, meaning that it operates on each element (or each pair of corresponding elements) separately. Mathematical functions such as np.sin and np.log work the same way and are applied to every element of the array. This makes it easy to perform operations on large amounts of data quickly and efficiently.

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# element-wise sine (inputs are interpreted as radians)
f = np.sin(b)
# element-wise cosine
g = np.cos(b)
# element-wise tangent
h = np.tan(b)
# element-wise natural logarithm (base e)
i = np.log(b)
# element-wise base-10 logarithm
j = np.log10(b)

print(f)
print(g)
print(h)
print(i)
print(j)
[-0.7568025  -0.95892427 -0.2794155 ]
[-0.65364362  0.28366219  0.96017029]
[ 1.15782128 -3.38051501 -0.29100619]
[1.38629436 1.60943791 1.79175947]
[0.60205999 0.69897    0.77815125]
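The element-wise arithmetic described in this section can be sketched with the same two arrays. The result names below are chosen only for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Each operator works on pairs of corresponding elements.
total = a + b        # 1+4, 2+5, 3+6
difference = b - a   # 4-1, 5-2, 6-3
product = a * b      # 1*4, 2*5, 3*6
quotient = b / a     # 4/1, 5/2, 6/3 (true division gives floats)

print(total)
print(difference)
print(product)
print(quotient)
```

Note that a * b is not a matrix product; it multiplies matching positions only.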

4. Data analysis using numpy

NumPy provides a convenient and powerful way to perform data analysis tasks on large datasets. One of the most common tasks in data analysis is finding the mean, median, and standard deviation of a dataset, and NumPy provides a function for each. np.mean calculates the average value of the data, np.median returns the middle value of the sorted data, and np.std measures how spread out the data is around the mean. Additionally, np.min and np.max find the smallest and largest values. These functions are useful for gaining insight into the properties of large datasets and can be used for a wide range of data analysis tasks.

data = np.array([2, 5, 12, 13, 19])

# use NumPy's built-in reductions to compute the sum and the product of a dataset
print(np.sum(data))
# np.prod is the supported name; the np.product alias was removed in NumPy 2.0
print(np.prod(data))
51
29640
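The summary statistics named in this section can be tried on the same dataset. This is a minimal sketch; note that np.std uses the population formula (ddof=0) unless told otherwise, and the result names are chosen only for this example.

```python
import numpy as np

data = np.array([2, 5, 12, 13, 19])

average = np.mean(data)    # sum of the values divided by their count
middle = np.median(data)   # middle value of the sorted data
spread = np.std(data)      # population standard deviation (ddof=0 by default)
smallest = np.min(data)
largest = np.max(data)

print(average, middle, spread)
print(smallest, largest)
```

Passing ddof=1 to np.std would give the sample standard deviation instead.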

Main Hack:

import numpy as np

age = np.loadtxt('files/age.csv', delimiter=',', dtype=str, encoding='utf-8')
print(age)
['Age' '18' '19' '20' '21' '22' '23' '24' '25' '26' '27' '28' '29' '30']
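Because the file was read with dtype=str, every entry, including the 'Age' header, is a string. One possible way to get numbers out of it is to skip the first entry and convert the rest. The sketch below builds the string array by hand instead of reading files/age.csv, so that it runs on its own; age_values is a name introduced only for this example.

```python
import numpy as np

# Stand-in for the loaded file: the same strings that np.loadtxt printed above.
age = np.array(['Age', '18', '19', '20', '21', '22', '23', '24',
                '25', '26', '27', '28', '29', '30'])

# Drop the header entry and convert the remaining strings to integers.
age_values = age[1:].astype(int)

print(age_values)
print(np.min(age_values), np.max(age_values), np.mean(age_values))
```

With numeric values in hand, functions such as np.min and np.max can be applied directly to the loaded data.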

Panda:

Min:

age = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
print(np.min(age))
18

Max:

age = [18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
print(np.max(age))
30

Numpy:

import numpy as np

# Generate a single random number
random_number = np.random.rand()
print(random_number)

# Generate an array of random numbers
random_array = np.random.rand(5)  # Generates an array of 5 random numbers
print(random_array)

# Generate a 2D array of random numbers
random_matrix = np.random.rand(3, 2)  # Generates a 3x2 matrix of random numbers
print(random_matrix)

# Generate random integers within a specific range
random_int = np.random.randint(1, 100)  # Generates a random integer from 1 to 99 (the upper bound, 100, is excluded)
print(random_int)

# Generate an array of random integers within a specific range
random_int_array = np.random.randint(1, 10, size=5)  # Generates an array of 5 random integers from 1 to 9 (the upper bound, 10, is excluded)
print(random_int_array)
0.22694578746731575
[0.90154584 0.14164067 0.1356398  0.17760852 0.8175945 ]
[[0.35257292 0.22036514]
 [0.90568225 0.00206035]
 [0.5691113  0.27038592]]
58
[8 3 9 2 4]
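The output above changes on every run. If repeatable results are wanted, one option is NumPy's newer Generator interface with a fixed seed. The sketch below is an alternative to the np.random.rand and np.random.randint calls used above, not what the original cell does; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence every time the script runs.
rng = np.random.default_rng(seed=42)

random_number = rng.random()          # one float in [0, 1)
random_array = rng.random(5)          # 5 floats in [0, 1)
random_matrix = rng.random((3, 2))    # 3x2 matrix of floats in [0, 1)

# integers() excludes the upper bound by default, like randint.
random_int_array = rng.integers(1, 10, size=5)                        # values 1..9
inclusive_int_array = rng.integers(1, 10, size=5, endpoint=True)      # values 1..10

print(random_number)
print(random_array)
print(random_matrix)
print(random_int_array)
print(inclusive_int_array)
```

Creating a second generator with the same seed would reproduce exactly the same numbers.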

import numpy as np

array = np.array([[[1,11,2], [2,13,3]], [[1,10,1], [1,1,1]]])
print(array.shape)
print(array)
(2, 2, 3)
[[[ 1 11  2]
  [ 2 13  3]]

 [[ 1 10  1]
  [ 1  1  1]]]

import numpy as np

linear_array = np.linspace(0, 20, 15)
print(linear_array)
[ 0.          1.42857143  2.85714286  4.28571429  5.71428571  7.14285714
  8.57142857 10.         11.42857143 12.85714286 14.28571429 15.71428571
 17.14285714 18.57142857 20.        ]
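The spacing in the output above follows from how np.linspace works: it takes a number of points, includes both endpoints, and divides the interval into one fewer gap than there are points. A short sketch that makes the step explicit and contrasts it with np.arange, which takes a step size instead of a count:

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values.
linear_array, step = np.linspace(0, 20, 15, retstep=True)

# 15 points span 14 equal gaps, so each gap is 20 / 14.
print(step)

# np.arange takes the step directly and excludes the stop value.
stepped_array = np.arange(0, 20, 2)
print(stepped_array)
```

Use np.linspace when the number of points matters and np.arange when the step size matters.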