Answer to Question #266272 in Python for sindhu

Question #266272

Write a Python script that does the following:


Download https://www.genecards.org/cgi-bin/cardlisttxt.pl and store it in a flat file.


The GeneCards database currently contains 270,168 GeneCards.


Parse the first 10 genes from each series (1A9N_Q to ZZZ3) using https://www.genecards.org/cgi-bin/carddisp.pl?gene=GENE NAME


  If a series has fewer than 10 genes, parse all of them.

Extract the Genomic Locations for GENE NAME and store them in a separate file for each gene you parse. For example:


Open https://www.genecards.org/cgi-bin/carddisp.pl?gene=A1BG


Scrape the “Genomic Locations for A1BG Gene” section; you will see:


Genomic Locations for A1BG Gene


chr19:58,345,178-58,353,492(GRCh38/hg38)


Size:8,315 bases


Orientation:Minus strand



Store the scraped output in a file and render it in HTML as it looks on GeneCards.




 


Expert's answer
2021-11-15T17:28:34-0500


SOLUTION TO THE ABOVE QUESTION


SOLUTION CODE


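import html

import requests

# Plain-text list of all GeneCards entries (the URL given in the question).
GENE_LIST_URL = 'https://www.genecards.org/cgi-bin/cardlisttxt.pl'
# Browser-like User-Agent header sent with the request.
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}


def gene_card_request():
    """Download the gene list and return it as text with HTML entities unescaped."""
    r = requests.get(GENE_LIST_URL, headers=HEADERS, timeout=30)
    r.raise_for_status()  # stop on 4xx/5xx instead of saving an error page
    return html.unescape(r.text)


if __name__ == '__main__':
    gene_list = gene_card_request()
    # Store the list in a flat file; the name gene_list.txt is an arbitrary choice.
    with open('gene_list.txt', 'w', encoding='utf-8') as f:
        f.write(gene_list)
    print(gene_list)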
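
The script above only covers downloading the gene list. For the "Genomic Locations" step, a minimal sketch of turning the three lines from the A1BG example into structured fields might look like the following. The function name parse_genomic_location and the regular expressions are assumptions made for illustration. The question does not show the page markup, so locating this text inside the GeneCards HTML is not covered here.

```python
import re


def parse_genomic_location(text):
    """Parse a 'Genomic Locations' text block (as shown in the question) into a dict."""
    # Example: chr19:58,345,178-58,353,492(GRCh38/hg38)
    loc = re.search(r'(chr\w+):([\d,]+)-([\d,]+)\s*\(([^)]+)\)', text)
    # Example: Size:8,315 bases
    size = re.search(r'Size:\s*([\d,]+)\s*bases', text)
    # Example: Orientation:Minus strand
    strand = re.search(r'Orientation:\s*(\w+)\s+strand', text)
    if not (loc and size and strand):
        raise ValueError('text does not look like a Genomic Locations block')
    return {
        'chromosome': loc.group(1),
        'start': int(loc.group(2).replace(',', '')),
        'end': int(loc.group(3).replace(',', '')),
        'assembly': loc.group(4),
        'size': int(size.group(1).replace(',', '')),
        'orientation': strand.group(1),
    }


# The sample text is copied from the A1BG example in the question.
sample = """chr19:58,345,178-58,353,492(GRCh38/hg38)
Size:8,315 bases
Orientation:Minus strand"""
info = parse_genomic_location(sample)
print(info)
```

Storing the coordinates as integers makes it easy to sanity-check a record: for the A1BG example, end - start + 1 equals the reported size of 8,315 bases.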

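
For the "first 10 genes from each series" step, one possible reading is that a series is the set of gene names sharing the same first character. That interpretation is an assumption, since the question does not define "series". Under it, the selection and the per-gene URL from the question could be sketched as below. The helper names first_n_per_series and gene_url are hypothetical, and the short name list is sample input only.

```python
from urllib.parse import quote

# Per-gene page URL given in the question.
CARD_URL = 'https://www.genecards.org/cgi-bin/carddisp.pl?gene='


def first_n_per_series(gene_names, n=10):
    """Group names by first character (assumed meaning of 'series'), keeping at most n each."""
    series = {}
    for name in gene_names:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the flat file
        bucket = series.setdefault(name[0].upper(), [])
        if len(bucket) < n:
            bucket.append(name)
    return series


def gene_url(name):
    """Build the GeneCards page URL for one gene name."""
    return CARD_URL + quote(name)


# Sample input standing in for lines read from the downloaded flat file.
names = ['1A9N_Q', 'A1BG', 'A1CF', 'A2M', 'BRCA1', 'BRCA2', 'ZZZ3']
picked = first_n_per_series(names, n=2)
print(picked)
print(gene_url('A1BG'))
```

A series with fewer than n names is returned whole, which matches the "parse all" rule in the question.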

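
For the final step, storing the scraped values and rendering them as HTML, a rough sketch is given below. The function name render_location_html, the page layout, and the output file name A1BG.html are all assumptions. This does not reproduce the actual GeneCards styling, only the heading and the three values shown in the question.

```python
import html


def render_location_html(gene, location, size, orientation):
    """Build a small standalone HTML page for one gene's genomic location."""
    g = html.escape(gene)  # escape values so they cannot break the markup
    return (
        '<!DOCTYPE html>\n'
        '<html><head><meta charset="utf-8">'
        f'<title>{g}</title></head>\n'
        '<body>\n'
        f'<h3>Genomic Locations for {g} Gene</h3>\n'
        f'<p>{html.escape(location)}</p>\n'
        f'<p>Size: {html.escape(size)}</p>\n'
        f'<p>Orientation: {html.escape(orientation)}</p>\n'
        '</body></html>\n'
    )


# Values taken from the A1BG example in the question.
page = render_location_html(
    'A1BG',
    'chr19:58,345,178-58,353,492 (GRCh38/hg38)',
    '8,315 bases',
    'Minus strand',
)
# One output file per gene, named after the gene.
with open('A1BG.html', 'w', encoding='utf-8') as f:
    f.write(page)
```

Opening A1BG.html in a browser shows the heading followed by the location, size, and orientation on separate lines.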

