How to Add a Key to a Dictionary in Python From Excel

XLRD/Python: Reading Excel file into dict with for-loops

I'm looking to read in an Excel workbook with 15 fields and about 2000 rows, and convert each row to a dictionary in Python. I then want to append each dictionary to a list. I'd like each field in the top row of the workbook to be a key within each dictionary, and have the corresponding cell value be the value within the dictionary. I've already looked at examples here and here, but I'd like to do something a bit different. The second example will work, but I feel it would be more efficient to loop over the top row once to populate the dictionary keys and then iterate through each remaining row to get the values. My Excel file contains data from discussion forums and looks something like this (obviously with more columns):

    id    thread_id    forum_id    post_time     votes    post_text
    4     100          3           1377000566    1        'here is some text'
    5     100          4           1289003444    0        'even more text here'

So, I'd like the fields id, thread_id and so on, to be the dictionary keys. I'd like my dictionaries to look like:

    {id: 4, thread_id: 100, forum_id: 3, post_time: 1377000566, votes: 1, post_text: 'here is some text'}

Initially, I had some code like this iterating through the file, but my scope is wrong for some of the for-loops and I'm generating way too many dictionaries. Here's my initial code:

    import xlrd
    from xlrd import open_workbook, cellname

    book = open_workbook('forum.xlsx', 'r')
    sheet = book.sheet_by_index(3)

    dict_list = []

    for row_index in range(sheet.nrows):
        for col_index in range(sheet.ncols):
            d = {}

            # My intuition for the below for-loop is to take each cell in the top row of the
            # Excel sheet and add it as a key to the dictionary, and then pass the value of
            # current index in the above loops as the value to the dictionary. This isn't
            # working.

            for i in sheet.row(0):
                d[str(i)] = sheet.cell(row_index, col_index).value
                dict_list.append(d)

Any help would be greatly appreciated. Thanks in advance for reading.

Answer #1:

The idea is to first read the header row into a list. Then iterate over the remaining sheet rows (starting with the row after the header), build a new dictionary from the header keys and the corresponding cell values, and append it to the list of dictionaries:

    from xlrd import open_workbook

    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read header values into the list
    keys = [sheet.cell(0, col_index).value for col_index in range(sheet.ncols)]

    dict_list = []
    for row_index in range(1, sheet.nrows):
        d = {keys[col_index]: sheet.cell(row_index, col_index).value
             for col_index in range(sheet.ncols)}
        dict_list.append(d)

    print(dict_list)

For a sheet containing:

    A   B   C   D
    1   2   3   4
    5   6   7   8

it prints:

    [{'A': 1.0, 'B': 2.0, 'C': 3.0, 'D': 4.0},
     {'A': 5.0, 'B': 6.0, 'C': 7.0, 'D': 8.0}]

UPD (expanding the dictionary comprehension):

    d = {}
    for col_index in range(sheet.ncols):
        d[keys[col_index]] = sheet.cell(row_index, col_index).value
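To see why the nested loops in the question produce too many dictionaries, the sketch below reproduces their structure on a plain list of lists. The `rows` data is a made-up stand-in for the sheet (not the asker's file), so neither an Excel file nor xlrd is needed to run it. It suggests that the original structure appends one dictionary per cell per header field, whereas the header-first approach appends one per data row.

```python
# Hypothetical stand-in for the sheet: the first row is the header.
rows = [
    ["id", "thread_id", "votes"],
    [4, 100, 1],
    [5, 100, 0],
]
nrows, ncols = len(rows), len(rows[0])

# Same loop structure as in the question: a dict is created for every cell
# and appended once per header field, so the list grows to
# nrows * ncols * ncols entries.
too_many = []
for row_index in range(nrows):
    for col_index in range(ncols):
        d = {}
        for header in rows[0]:
            d[header] = rows[row_index][col_index]
            too_many.append(d)

# Header-first structure: read the keys once, then build one dict per data row.
keys = rows[0]
dict_list = []
for row_index in range(1, nrows):
    d = {keys[col_index]: rows[row_index][col_index] for col_index in range(ncols)}
    dict_list.append(d)

print(len(too_many))   # 3 * 3 * 3 = 27
print(len(dict_list))  # 2, one per data row
```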

Answer #2:

    from xlrd import open_workbook

    dict_list = []
    book = open_workbook('forum.xlsx')
    sheet = book.sheet_by_index(3)

    # read the first row for keys
    keys = sheet.row_values(0)

    # read the remaining rows for values
    values = [sheet.row_values(i) for i in range(1, sheet.nrows)]

    for value in values:
        dict_list.append(dict(zip(keys, value)))

    print(dict_list)
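The `dict(zip(keys, value))` step can be tried without a workbook. This is a minimal sketch in which `keys` and `values` are hypothetical lists shaped like what `sheet.row_values()` returns. xlrd reports numeric cells as floats, so the sample numbers are written as floats. The `as_int_if_whole` helper is not part of the answer; it is an optional, made-up cleanup step for when whole numbers such as `4.0` should become `4`.

```python
# Hypothetical sample data standing in for sheet.row_values(...) results.
keys = ["id", "thread_id", "post_text"]
values = [
    [4.0, 100.0, "here is some text"],
    [5.0, 100.0, "even more text here"],
]

# zip pairs each header with the cell in the same column position;
# dict() turns those pairs into one dictionary per row.
dict_list = [dict(zip(keys, value)) for value in values]


def as_int_if_whole(cell):
    """Turn 4.0 into 4; leave text and fractional numbers untouched."""
    if isinstance(cell, float) and cell.is_integer():
        return int(cell)
    return cell


# Optional cleanup pass (an assumption about what the caller wants).
cleaned = [{k: as_int_if_whole(v) for k, v in d.items()} for d in dict_list]

print(dict_list[0])
print(cleaned[0])
```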

Answer #3:

Try this one. The function below returns a generator that yields one dict per row, keyed by the column headers.

    from xlrd import open_workbook

    def parse_xlsx():
        workbook = open_workbook('excelsheet.xlsx')
        sheets = workbook.sheet_names()
        active_sheet = workbook.sheet_by_name(sheets[0])
        num_rows = active_sheet.nrows
        num_cols = active_sheet.ncols
        # lower-cased header names become the dictionary keys
        header = [active_sheet.cell_value(0, cell).lower() for cell in range(num_cols)]
        for row_idx in range(1, num_rows):
            row_cell = [active_sheet.cell_value(row_idx, col_idx) for col_idx in range(num_cols)]
            yield dict(zip(header, row_cell))

    # the function must be defined before it is called
    for row in parse_xlsx():
        print(row)  # {id: 4, thread_id: 100, forum_id: 3, post_time: 1377000566, votes: 1, post_text: 'here is some text'}
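Because `parse_xlsx()` is a generator, each row dictionary is built only when the caller asks for it. Here is a minimal sketch of that behaviour, using a hypothetical `iter_row_dicts` helper that takes plain lists instead of an xlrd sheet. The sample `rows` are invented for illustration.

```python
def iter_row_dicts(rows):
    """Yield one dict per data row; `rows` is a list of lists, header first."""
    header = [str(name).lower() for name in rows[0]]
    for row in rows[1:]:
        yield dict(zip(header, row))


# Hypothetical sample data, not read from a real workbook.
rows = [
    ["ID", "Votes", "Post_Text"],
    [4, 1, "here is some text"],
    [5, 0, "even more text here"],
]

gen = iter_row_dicts(rows)
first = next(gen)       # only the first data row has been converted so far
remaining = list(gen)   # consume whatever rows are left

print(first)
print(len(remaining))
```

A generator like this avoids holding every row dictionary in memory at once. Wrapping the call in `list(...)` gives the full list the question asks for.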

Answer #4:

First set up your keys by parsing just the header row across all columns, then use a second function to parse the data rows, and call the two functions in that order.

    all_fields_list = []
    header_dict = {}

    def parse_data_headers(sheet):
        global header_dict
        for c in range(sheet.ncols):
            # 0 is the index of the header row (xlrd row indexes are zero-based)
            key = sheet.cell(0, c).value
            header_dict[c] = key  # store it somewhere, here I have chosen to store in a dict

    def parse_data(sheet):
        for r in range(1, sheet.nrows):  # data starts on the row after the header
            row_dict = {}
            for c in range(sheet.ncols):
                value = sheet.cell(r, c).value
                row_dict[header_dict[c]] = value  # key each value by its column header
            all_fields_list.append(row_dict)

    # call them in order:
    # parse_data_headers(sheet)
    # parse_data(sheet)

Answer #5:

This script lets you transform Excel data into a list of dictionaries:

    import xlrd

    workbook = xlrd.open_workbook('forum.xls', on_demand=True)
    worksheet = workbook.sheet_by_index(0)

    first_row = []  # the row where we store the column names
    for col in range(worksheet.ncols):
        first_row.append(worksheet.cell_value(0, col))

    # transform the worksheet into a list of dictionaries
    data = []
    for row in range(1, worksheet.nrows):
        elm = {}
        for col in range(worksheet.ncols):
            elm[first_row[col]] = worksheet.cell_value(row, col)
        data.append(elm)

    print(data)
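The same loop can be wrapped in a function that accepts any object exposing `nrows`, `ncols` and `cell_value(row, col)`, which is the subset of the xlrd sheet interface used above. In this sketch, `sheet_to_dicts` and `FakeSheet` are hypothetical names introduced for illustration. `FakeSheet` only mimics those three members so the function can be exercised without an Excel file.

```python
class FakeSheet:
    """Hypothetical stand-in exposing nrows, ncols and cell_value like an xlrd sheet."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)
        self.ncols = len(rows[0]) if rows else 0

    def cell_value(self, rowx, colx):
        return self._rows[rowx][colx]


def sheet_to_dicts(sheet, header_row=0):
    """Return a list of dicts keyed by the values found in header_row."""
    header = [sheet.cell_value(header_row, col) for col in range(sheet.ncols)]
    data = []
    for row in range(header_row + 1, sheet.nrows):
        data.append({header[col]: sheet.cell_value(row, col)
                     for col in range(sheet.ncols)})
    return data


# Made-up sample rows: header first, then two data rows.
sheet = FakeSheet([
    ["id", "votes"],
    [4, 1],
    [5, 0],
])
data = sheet_to_dicts(sheet)
print(data)
```

With xlrd, the call would look like `sheet_to_dicts(workbook.sheet_by_index(0))`. Note that xlrd 2.0 and later read only `.xls` files, so an `.xlsx` workbook needs an older xlrd release or a different reader.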

Source: https://www.py4u.net/discuss/156837
