[SOLVED] Data science and Longest Common subsequence Python

I need support with this Computer Science question so I can learn better.


I have attached the .txt file, the function to import the .txt file, and the recursive longest common subsequence code.

Q2) (20 points) Data Science is one of the most popular areas where Python is widely used. In this question, you will have an opportunity to take your first tiny baby step into this territory. You will also develop your own simple hash table using the existing standard Python data types.

  1. Modify the recursive version of LCS to count the number of recursive calls made for each 6-digit integer string against the string “0123456789”. The function returns a tuple of two elements: (LCS num, the number of recursive calls needed to find the LCS).
  2. I think I found a pretty good (but very slow) hash function for integer strings, which is the recursive LCS function. 🙂 Use this function as your hash function to store the integer strings you read in Q1) into your own hash table. Please do not use the Python dict() data type directly. You should develop your own hash table where the keys are the numbers of recursive calls as computed by the recursive LCS function.
  3. Now, to see how good the new hash function is at generating keys uniformly, we want to count the number of collisions for each and every key in the hash table built from 1M integers. Ideally, the average number of collisions would be close to 100, since 1M numbers spread over 10,000 buckets gives 100 per bucket. We can get an idea of this by computing the average number of collisions, but we may also visualize the distribution of the collisions across all the keys using a Python plotting library.
  4. Use a plotting library (https://plot.ly/python/) to visualize the distribution of key collisions of the LCS hash function:

import matplotlib.pyplot as plt
from plotly.graph_objs import Bar, Layout
from plotly import offline

# or any other library of your choice, such as pandas, if you prefer

    • Draw an offline bar graph to show the relationship between the hash keys (the number of recursive calls to compute the LCS against “0123456789”) and the number of collisions for each hash key. You may refer to the chart that I came up with using the Python dict() class (I could have used the collections.Counter class for that matter), but again, you should use your own hash table for this homework, which maps each key to the corresponding list of collisions. In any case, the final chart should look very similar, if not identical. << lcsUniformity.html >>
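Parts 1 and 2 above can be sketched as follows. This is only a minimal sketch under stated assumptions, not the attached instructor code: it assumes "LCS num" means the LCS length, that every invocation (base cases included) counts as one recursive call, and that chaining over a plain list of buckets is an acceptable way to avoid dict(). The names `lcs_with_calls`, `CallCountTable`, and `build_table` are hypothetical.

```python
TARGET = "0123456789"


def lcs_with_calls(a, b):
    """Return (LCS length, number of calls), counting every invocation."""
    if not a or not b:
        return 0, 1  # assumption: a base-case call still counts as one call
    if a[-1] == b[-1]:
        length, calls = lcs_with_calls(a[:-1], b[:-1])
        return length + 1, calls + 1
    left_len, left_calls = lcs_with_calls(a[:-1], b)
    right_len, right_calls = lcs_with_calls(a, b[:-1])
    return max(left_len, right_len), left_calls + right_calls + 1


class CallCountTable:
    """Chained hash table built from lists only (no dict)."""

    def __init__(self, num_buckets=10007):
        # Each bucket holds [key, values] pairs whose keys share a slot.
        self._buckets = [[] for _ in range(num_buckets)]

    def _slot(self, key):
        return self._buckets[key % len(self._buckets)]

    def add(self, key, value):
        slot = self._slot(key)
        for entry in slot:
            if entry[0] == key:
                entry[1].append(value)  # same key: one more collision
                return
        slot.append([key, [value]])

    def get(self, key):
        for entry in self._slot(key):
            if entry[0] == key:
                return entry[1]
        return []

    def items(self):
        for slot in self._buckets:
            for key, values in slot:
                yield key, values


def build_table(strings, target=TARGET, num_buckets=10007):
    """Store each string under its recursive-call count."""
    table = CallCountTable(num_buckets)
    for s in strings:
        _, calls = lcs_with_calls(s, target)
        table.add(calls, s)
    return table
```

With this sketch, `len(values)` for each pair yielded by `table.items()` is the number of strings that collided on that key, which is the quantity the bar graph is meant to show.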

5. From my personal experience, I firmly believe that no meaningful data analysis can be completed without bringing human intuition into the data interpretation process. Let’s exercise our own. Look at my chart mentioned above. It’s not perfect, but it appears fairly uniformly distributed across the keys (LCS numbers). If you look at it closely, though, the collision rate tapers off after a certain threshold: the larger the number of recursive calls gets, the smaller the number of collisions (the frequencies of hash keys generated in that range) gets. Those keys are thinly spread out. Also note that these numbers take much longer to hash with the LCS hash function, because it makes many more recursive calls before the recursion finishes. So I would like to limit the number of recursive calls once it passes a certain threshold. With that in mind,

  • Using your intuition, pick a threshold number from your own bar graph and give your reasons for picking it. There is no right or wrong answer here.
  • To handle the numbers that go over the threshold, you may develop and run a simple secondary hash function of your own choice that collapses those outliers into a shorter range, so that the upper bound of your entire hash table is less than or equal to 10,000.
  • Then run your new algorithm and redraw the distribution across the new set of keys added to the existing keys.
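One possible secondary hash for the outliers is sketched below. It is only an illustration under stated assumptions: keys at or below the threshold are kept as they are, and larger call counts are wrapped with a modulo into the range between the threshold and the upper bound. The function name `fold_key` is hypothetical, and the threshold itself is left as a parameter because the assignment asks you to choose it from your own chart.

```python
def fold_key(calls, threshold, upper=10_000):
    """Map a raw call count to a hash key that never exceeds `upper`.

    Assumption: counts above `threshold` are folded into the range
    threshold + 1 .. upper by a simple modulo; other schemes work too.
    """
    if threshold >= upper:
        raise ValueError("threshold must be smaller than upper")
    if calls <= threshold:
        return calls  # below the cutoff the primary key is kept unchanged
    span = upper - threshold
    return threshold + 1 + (calls - threshold - 1) % span
```

A modulo fold keeps the collapsed keys spread over the whole remaining range rather than piling them into a single overflow bucket. Note that it does not by itself cut the recursion short; stopping the LCS recursion once the count passes the threshold would need a separate change to the LCS function.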

Since running LCS for the million numbers in the rand1000000.txt file takes quite a long time, you may first test it with the rand1000.txt file (less than 10 seconds).

Q2 Deliverables:

  • Python code for the modified recursive LCS and the driver program
  • The offline html files of the hash table collision charts for both before and after modification
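The offline HTML charts could be produced along these lines. This is a hedged sketch that assumes the collision data is already available as (key, count) pairs taken from your own hash table; the helper names, the chart titles, and the output filename `lcs_collisions.html` are all hypothetical. The plotly import is placed inside the drawing function so that the data helpers can be used without plotly installed.

```python
def collision_series(pairs):
    """Sort (key, count) pairs by key and split them into x and y lists."""
    ordered = sorted(pairs)
    xs = [key for key, _ in ordered]
    ys = [count for _, count in ordered]
    return xs, ys


def average_collisions(pairs):
    """Mean number of strings per key, over the keys that actually occur."""
    counts = [count for _, count in pairs]
    return sum(counts) / len(counts) if counts else 0.0


def write_collision_chart(pairs, filename="lcs_collisions.html"):
    """Write an offline bar chart of collisions per hash key."""
    from plotly.graph_objs import Bar, Layout
    from plotly import offline

    xs, ys = collision_series(pairs)
    layout = Layout(
        title="Collisions per LCS hash key",
        xaxis={"title": "Hash key (recursive calls)"},
        yaxis={"title": "Number of collisions"},
    )
    # auto_open=False writes the HTML file without launching a browser.
    offline.plot({"data": [Bar(x=xs, y=ys)], "layout": layout},
                 filename=filename, auto_open=False)
```

Calling `write_collision_chart` once with the original keys and once with the folded keys, using two different filenames, would give the before and after files.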
