morfizm (morfizm) wrote,

Smart Ass Forum

It annoys me when people act "too smart": instead of first giving a direct answer to a technical question, they try to guess what the person really needed and answer a different question. Of course that answer may be helpful for that particular person, but it screws up the entire forum search experience for anyone who needs the answer to the original question.

For example:

The title is "more memory for python script", which is almost exactly what I was looking for (I searched for "give more memory to python").

If you look down the thread, user "pmasiar" started debugging the algorithm and also made an incorrect statement about memory management ("Python takes all the memory it needs."). It actually doesn't. On my Windows machine I saw it take only 2 GB and then fail with a memory error. It didn't even try to use the page file, not to mention that I had more than 2 GB available anyway.

* * *

The right answer turned out to be: 32-bit Python runs as a 32-bit process and doesn't attempt to use 64-bit memory APIs, so its memory is limited to 2 GB or 3 GB by the address space, no matter how much RAM or page file is available.
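A quick way to check which kind of interpreter you are running is to look at the pointer size. This is only a minimal sketch: the helper names `pointer_bits` and `address_space_ceiling_gb` are made up for illustration, and the ceiling it prints is the theoretical one, not the smaller share a 32-bit Windows process actually gets.

```python
import struct
import sys

def pointer_bits():
    # Size of a C pointer (void *) in this interpreter build, in bits.
    return struct.calcsize("P") * 8

def address_space_ceiling_gb():
    # Theoretical address space implied by the pointer width, in whole GB.
    # A 32-bit Windows process normally gets only part of it (2 GB or 3 GB).
    return 2 ** pointer_bits() // 1024 ** 3

if __name__ == "__main__":
    print("%d-bit Python" % pointer_bits())
    print("theoretical ceiling: %d GB" % address_space_ceiling_gb())
    print("sys.maxsize > 2**32: %s" % (sys.maxsize > 2 ** 32))
```

If this reports 32 bits, adding RAM or page file will not help; a 64-bit build of Python is the way around the limit.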

It's interesting that finding out how much memory is available to Python is a damn hard task, because Python wants the developer to abstract away from the physical representation of data in memory, so there's no sizeof(). Python also does tricks such as not allocating extra memory for strings with the same contents. It probably keeps hashes of all strings and doesn't allocate a new string if an identical one already exists. Since strings are immutable, it can do that. This was one of the things I found out in my memory test.
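The string sharing can be checked directly with the `is` operator. The sketch below rests on a few assumptions: it targets CPython, the helper `build` and the alias `intern_string` are made-up names, and the sample strings are arbitrary. As far as I know, CPython shares string objects only in some cases (for example literals and explicitly interned strings), so the result of the plain `is` check may depend on how the strings were created. The closest thing to sizeof() is `sys.getsizeof`, available since Python 2.6, which reports only the shallow size of a single object.

```python
import sys

try:
    intern_string = sys.intern      # Python 3
except AttributeError:
    intern_string = intern          # Python 2 builtin

def build(parts):
    # Build the string at run time so the compiler cannot fold it
    # into a shared constant.
    return "".join(parts)

a = build(["memory", " ", "test"])
b = build(["memory", " ", "test"])

same_value = (a == b)    # equal contents
same_object = (a is b)   # True only if the interpreter shared the object
shared = intern_string(a) is intern_string(b)  # explicit interning shares

print("shallow size of a: %d bytes" % sys.getsizeof(a))
print("same value: %s, same object: %s, same after interning: %s"
      % (same_value, same_object, shared))
```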

Python doesn't seem to use this trick with newly generated long numbers: I took a factorial implementation from (thanks _m_e_ for the link! Wonderful, btw, especially Windows Programmer and Web Designer), ran it 10000 times while watching the memory footprint grow in Task Manager, and then repeated that until I got a memory exception:

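from array import array  # this is sexy!

def factorial(x):
    res = 1
    for i in xrange(2, x + 1):
        res *= i
    return res

l = []  # keeps every result alive so the memory is never freed
i = 0   # running estimate of allocated bytes
while True:
    # each batch of 10000 results takes (27264-15784)*1024 bytes,
    # as measured in Task Manager
    for j in xrange(0, 10000):
        # NOTE: the loop body is missing from the post as published.
        # Appending a fresh factorial is an assumption, and the
        # argument 1000 is a guess.
        l.append(factorial(1000))
    i = i + (27264-15784)*1024
    print i/1024/1024, "mb allocated"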
I've just noticed that I don't use "array" in this implementation. I actually tried to use arrays of bytes but got inconsistent results for the memory footprint. But I kept the import statement as a souvenir, it's gorgeous!
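For anyone who wants to repeat the observation about long numbers without running until the memory error, here is a bounded sketch. It is not the test above: the function `measure` and its parameters `count` and `n` are made up for illustration, it uses `range` so it also runs on Python 3, and it estimates size with `sys.getsizeof` instead of Task Manager, so the numbers will not match the footprint I measured.

```python
import sys

def factorial(x):
    res = 1
    for i in range(2, x + 1):
        res *= i
    return res

def measure(count, n):
    # Keep `count` separately computed copies of n! alive, then report
    # how many distinct objects they are and their combined shallow size.
    values = [factorial(n) for _ in range(count)]
    distinct = len(set(id(v) for v in values))
    total_bytes = sum(sys.getsizeof(v) for v in values)
    return distinct, total_bytes

# count=100 and n=200 are arbitrary illustrative values
distinct, total_bytes = measure(100, 200)
print("%d distinct objects, about %d KB" % (distinct, total_bytes // 1024))
```

If equal long numbers were shared the way some strings are, the distinct count would be 1; on CPython every computed result should be its own object.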
Tags: in english, software development
