Return a file as dictionary

So here is a file

APPLE: toronto, 2018, garden, tasty, 5

apple is a tasty fruit

>>>end 

apple is a sour fruit

>>>end

grapes: america, 24, organic, sweet, 4

grapes is a sweet fruit

>>>end

This is a file which also has newline characters.
I want to create a dictionary from the file. It goes like this:

The function is def f(file_to: TextIO) -> Dict[str, List[tuple]]

file_to is the file name entered, and the function will return a dictionary like:

{'apple': [('apple is a tasty fruit', 2018, 'garden', 'tasty', 5), ('apple is a sour fruit',)], 'grapes': [('grapes is a sweet fruit', 24, 'organic', 5)]}

Each fruit is a key, and its descriptions are the values, formatted as shown. Each entry ends at >>>end.

I tried:

with open(file_to, "r") as myfile:
    data = myfile.readlines()
return data

It returns the file's lines in a list, each ending with \n. I'm thinking I can use strip() to remove that and take the element that comes before ':' as the key.
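
The idea described above (strip the newline, then take what comes before ':' as the key) can be sketched as follows. This is only a minimal illustration, not a full solution; the `lines` list is hypothetical sample data standing in for what readlines() would return.

```python
# Hypothetical sample standing in for the list returned by readlines().
lines = [
    "APPLE: toronto, 2018, garden, tasty, 5\n",
    "apple is a tasty fruit\n",
    ">>>end\n",
]

# strip() removes the trailing newline and any surrounding whitespace.
stripped = [line.strip() for line in lines]

# partition(':') splits at the first colon only, so later colons are kept.
key, sep, rest = stripped[0].partition(":")
key = key.lower()
fields = [field.strip() for field in rest.split(",")]

print(key)     # apple
print(fields)  # ['toronto', '2018', 'garden', 'tasty', '5']
```

Using partition() rather than split(':') means a colon later in the line does not produce extra pieces.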

The code I tried is:

from pprint import pprint
import re

def main():
    fin = open('f1.txt', 'r')

    data = {}
    key = ''
    parsed = []

    for line in fin:
        line = line.rstrip()
        if line.startswith('>'):
            # End marker: store what was collected and start again.
            data[key] = parsed
            parsed = []
        elif ':' in line:
            # Header line: split on runs of non-word characters.
            parts = re.split(r'\W+', line)
            key = parts[0].lower()
            parsed += parts[2:]
        else:
            parsed.insert(0, line)

    fin.close()
    pprint(data)


main()

It's not giving the right expected result :(

edited Nov 28 '18 at 17:22

asked Nov 28 '18 at 17:08

Comp


Your attempt doesn't match the annotations.

– TheIncorrigible1
Nov 28 '18 at 17:12

Why not just use JSON or XML?

– Dennis Patterson
Nov 28 '18 at 17:19


@DennisPatterson It sounds like they're being handed a requirement and can't change the process (given the function snippet)

– TheIncorrigible1
Nov 28 '18 at 17:24

python file dictionary

2 Answers

I don't think that you really need re and pprint. I have tried with an easy list comprehension and some if statements.

def main():
    data = {}

    with open('f1.txt', 'r') as fin:
        for line in fin:
            line = line.rstrip()
            if line.startswith('>'):
                continue  # A line starting with '>' is an end marker, so skip it.
            elif ':' in line:
                parts = line.split(":")
                key = parts[0].lower()

                firstInfo = parts[1].split(",")  # Fields to add to the value after reading the next line.
                firstInfo.pop(0)  # Drop the first field, the place name, as it is not required.

                secondInfo = fin.readline().strip()  # The next line becomes the first element of the value.

                value = [secondInfo]
                value.extend([x.strip() for x in firstInfo])  # Append the remaining fields without their spaces.

                data[key] = value

    print(data["apple"])
    return data

If you encounter any problem with this implementation, I will be happy to help. (although this is self explanatory :P)

answered Nov 28 '18 at 17:25

MaJoR


Thank you for your help. But what if a line inside an entry has a ':' in the middle? For example, with APPLE: toronto, 2018, garden, tasty, 5 followed by apple is: a tasty fruit, the code treats the second line as another main key. It is not one, because whatever comes before >>>end belongs to one key and its value. What if a value has a ':'?

– Comp
Nov 28 '18 at 22:24


@Comp I revised my code (again :-P ) to take this situation into account. elif re.match(r'^\w+:\s', line): uses the regular expression ^\w+:\s to identify the header line, which holds the key and the rest of the fields. The ^ says to start the match at the beginning of the line, and \w+ says to match one or more word characters, immediately followed by : and a whitespace character \s. That should not allow a match if : is somewhere else in a line that is not a header line.

– Chris Charley
Nov 28 '18 at 23:27

@Comp That case is unlikely to happen, because I am reading another line (secondInfo = fin.readline().strip()) when I get my first key, assuming that the first line of the data contains the key followed by a colon. However, @Chris's answer is also correct, and much better at handling the key detection.

– MaJoR
Nov 29 '18 at 7:57
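
The header pattern discussed in the comments above can be checked against a few lines with a short sketch. The sample strings are made up for illustration, and `HEADER` is just a name chosen here for the compiled pattern.

```python
import re

# Header pattern from the comment above: word characters, then ':' and whitespace.
HEADER = re.compile(r"^\w+:\s")

samples = [
    "APPLE: toronto, 2018, garden, tasty, 5",  # header line
    "apple is: a tasty fruit",                 # colon later in a description line
    "apple is a tasty fruit",                  # no colon at all
    ">>>end",                                  # end marker
]

is_header = [bool(HEADER.match(s)) for s in samples]
print(is_header)  # [True, False, False, False]
```

One caveat: a description line that starts with a single word followed by a colon and a space, such as "note: very sweet", would still match this pattern and be treated as a header.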


I made some adjustments to your code (which I gave you in a previous post). I think this gives what you want with your updated data.

The data:

APPLE: toronto, 2018, garden, tasty, 5

apple is a tasty fruit

>>>end

apple is a sour fruit

apple is ripe

>>>end

apple is red

>>>end

grapes: america, 24, organic, sweet, 4

grapes is a sweet fruit

>>>end

And here is the updated code:

import re


def main():
    fin = open('f1.txt', 'r')

    data = {}

    for line in fin:
        line = line.rstrip()
        if line.startswith('>'):
            # End marker: store the first block together with the header fields.
            if key not in data:
                data[key] = [tuple(parts)]

        elif re.match(r'^\w+:\s', line):
            # Header line: keep the key, discard the place, collect the other fields.
            key, _, *parts = re.split(r'[:,]\s+', line)
        else:
            if key in data:
                data[key].append(line)
            else:
                parts.insert(0, line)

    fin.close()

    # Merge every line after the first block into a single second tuple.
    for key in data:
        if len(data[key]) > 1:
            data[key][1] = tuple(data[key][1:])
            del data[key][2:]

    print(data)


main()

The output from this revised data and code is:

{'APPLE': [('apple is a tasty fruit', '2018', 'garden', 'tasty', '5'), ('apple is a sour fruit', 'apple is ripe', 'apple is red')], 'grapes': [('grapes is a sweet fruit', '24', 'organic', 'sweet', '4')]}
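
For comparison, here is a rough sketch of a variation that follows the signature from the question, f(file_to: TextIO) -> Dict[str, List[tuple]]. It is not the code above. It assumes that blank lines carry no data, that keys should be lowercased, that digit-only fields should become int, and that each >>>end closes one tuple. The names `HEADER`, `fields` and `block` are chosen here for illustration only.

```python
import io
import re
from typing import Dict, List, TextIO

HEADER = re.compile(r"^\w+:\s")


def f(file_to: TextIO) -> Dict[str, List[tuple]]:
    """Group each '>>>end'-terminated block under the most recent header key."""
    data: Dict[str, List[tuple]] = {}
    key = None   # current fruit; None until the first header is seen
    fields = []  # header fields waiting to be attached to the first block
    block = []   # description lines of the block being read

    for raw in file_to:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no data
        if line.startswith(">>>end"):
            if key is not None:
                data[key].append(tuple(block + fields))
            block, fields = [], []
        elif HEADER.match(line):
            name, _, rest = line.partition(":")
            key = name.lower()
            data.setdefault(key, [])
            # Drop the first field (the place) and convert digit-only fields to int.
            values = [v.strip() for v in rest.split(",")][1:]
            fields = [int(v) if v.isdigit() else v for v in values]
        else:
            block.append(line)
    return data


sample = io.StringIO(
    "APPLE: toronto, 2018, garden, tasty, 5\n"
    "apple is a tasty fruit\n"
    ">>>end\n"
    "apple is a sour fruit\n"
    ">>>end\n"
    "grapes: america, 24, organic, sweet, 4\n"
    "grapes is a sweet fruit\n"
    ">>>end\n"
)
print(f(sample))
```

Passing an already opened file object (or io.StringIO, as here) matches the TextIO annotation; with a file name you would wrap the call in with open(name) as fh: and pass fh.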

edited Nov 28 '18 at 23:18

answered Nov 28 '18 at 20:03

Chris Charley


Thanks Chris for your help. I got an error at the line if key in data: builtins.UnboundLocalError: local variable 'key' referenced before assignment

– Comp
Dec 3 '18 at 3:30


@Comp Then your data sequence doesn't match the sequence you supplied in your sample. It's likely that elif re.match(r'^\w+:\s', line): isn't matching the line because there might be non-word characters (anything outside [a-zA-Z0-9_]) in the part preceding the colon.

– Chris Charley
Dec 3 '18 at 16:24
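
To illustrate the point in the comment above, here is a small sketch showing header lines that the pattern rejects, and one way to avoid referencing `key` before it is assigned. The header strings are hypothetical, and `first_key` is a helper invented for this example.

```python
import re

HEADER = re.compile(r"^\w+:\s")

# Hypothetical header lines; only the first has nothing but word characters before ':'.
candidates = [
    "grapes: america, 24",
    "green-apple: toronto, 2018",
    "red apple: toronto, 2018",
]
matches = {c: bool(HEADER.match(c)) for c in candidates}
print(matches)


def first_key(lines):
    """Return the first header key, or None when no line matches the pattern."""
    key = None  # initialised up front, so it is never referenced before assignment
    for line in lines:
        if HEADER.match(line):
            key = line.partition(":")[0]
            break
    return key


print(first_key(["green-apple: toronto, 2018"]))  # None
```

With `key` starting as None, the calling code can test for that value and report a malformed file instead of failing with UnboundLocalError.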
