How to Get File Size In Python? | Python Size Check

Being able to programmatically check file sizes is an important skill for any Python developer. You may need to retrieve sizes for monitoring disk usage, debugging file issues, building GUIs to display sizes, and many other tasks.

In this Python guide, you’ll learn several techniques to get file sizes using functions from os, pathlib, and more. Let’s get started!

Overview of File Size in Python

First, what exactly is a file size in Python? Some key points:

  • File sizes are measured in bytes – the number of bytes of data stored in the file.
  • Text files are stored as encoded text (UTF-8, UTF-16, etc.). Their size depends on both the content and the encoding.
  • Binary files like images, videos, etc. have sizes matching the exact number of bytes of raw data.
  • File sizes indicate how much data a file holds. The space actually allocated on disk can differ because filesystems allocate storage in blocks.
  • The OS tracks sizes that Python can query through various interfaces.
  • Sizes can be used for monitoring usage, debugging, display in UIs, and more.

So in summary, Python can easily read sizes in bytes to work with file content programmatically.

Retrieving File Sizes in Python

Python has several approaches for getting file sizes through the os, pathlib, and other modules. Let’s go through each method with examples.

1. The os.path.getsize() Function

The simplest way is using os.path.getsize():

import os

size = os.path.getsize('file.txt') 
print(size) # prints size in bytes

This takes a file path and returns the size. Some key points:

  • Returns size in bytes as an integer.
  • Works on any valid file path on the system.
  • Handles both text and binary files.
  • Raises OSError for invalid file paths or permission issues.
  • No need to open the file first to get size.

For example:

print(os.path.getsize('documents/report.pdf')) 
# Perhaps returns 65300

So os.path.getsize() is the simplest way to check a file size by path.
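To check this behaviour on a file whose contents are known, here is a minimal sketch that writes exactly 1024 bytes to a temporary file and reads the size back. The temporary file and the variable names are only illustrative.

```python
import os
import tempfile

# Write exactly 1024 bytes to a temporary file (illustrative data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 1024)
    tmp_path = tmp.name

try:
    size = os.path.getsize(tmp_path)  # no need to open the file again
    print(size)  # 1024
finally:
    os.remove(tmp_path)  # clean up the temporary file
```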

2. Using os.stat()

The os.stat() function returns a file’s full stat info as an object:

import os

stats = os.stat('file.txt')

This stat_result object contains the size in st_size:

size = stats.st_size
print(size) # size in bytes

os.stat() is useful when you need other file info like modified times, permissions, etc. It avoids multiple calls.

For example:

stats = os.stat('documents/report.pdf')

print(f'Size: {stats.st_size} bytes')
print(f'Modified: {stats.st_mtime}') 

So os.stat() is great when you need multiple file properties, not just the size.
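Note that st_mtime is a raw timestamp in seconds since the epoch, so it usually needs converting before display. The sketch below reads the size and the modification time from a single os.stat() call and formats the time with datetime. The temporary file and its contents are assumptions made for the example.

```python
import os
import tempfile
from datetime import datetime

# Create a small temporary file with known contents (12 bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example data")
    tmp_path = tmp.name

try:
    stats = os.stat(tmp_path)  # one call, several properties
    modified = datetime.fromtimestamp(stats.st_mtime)  # local time
    print(f"Size: {stats.st_size} bytes")
    print(f"Modified: {modified:%Y-%m-%d %H:%M:%S}")
finally:
    os.remove(tmp_path)  # clean up the temporary file
```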

3. Using File Objects

We can also use file objects to get sizes:

import os

# Open in binary mode so tell() returns a byte offset
with open('file.txt', 'rb') as file:
    file.seek(0, os.SEEK_END)  # Seek to end
    size = file.tell()

print(size)

This seeks to the end to get the total size, which gets returned from tell().

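The main advantage over the other methods is that it works when all you have is an open file object rather than a path.

The downside is having to open and manage the file rather than just passing a path. Seeking also moves the file position, so restore it if you intend to keep reading.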

So file objects are best for when you have an open file and need to process it incrementally.
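When the file is already partly read, the seek-and-tell approach can be wrapped so that it leaves the read position where it found it. The helper name stream_size below is hypothetical, and an in-memory io.BytesIO stands in for a real file opened in binary mode.

```python
import io
import os

def stream_size(stream):
    """Return the total size in bytes of a seekable binary stream."""
    current = stream.tell()       # remember the current position
    stream.seek(0, os.SEEK_END)   # jump to the end
    size = stream.tell()          # the offset at the end equals the size
    stream.seek(current)          # restore the original position
    return size

buffer = io.BytesIO(b"0123456789")
buffer.read(4)                    # pretend we are in the middle of processing
print(stream_size(buffer))        # 10
print(buffer.tell())              # 4, the position is unchanged
```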

4. The pathlib Module

The pathlib module also allows getting sizes:

from pathlib import Path

size = Path('file.txt').stat().st_size
print(size) 

Path.stat() returns the same stat_result object as os.stat().

Some benefits of pathlib:

  • More object-oriented approach to files rather than raw paths.
  • Integrates well with other file handling and I/O code.
  • Allows chaining other file operations like permissions, etc.

Overall pathlib is great for more robust file handling.
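As one illustration of chaining pathlib operations, the following sketch sums the sizes of every regular file under a directory. The function name total_size and the temporary directory layout are assumptions made for the example.

```python
import tempfile
from pathlib import Path

def total_size(directory):
    """Sum the sizes of all regular files under directory, recursively."""
    return sum(p.stat().st_size for p in Path(directory).rglob("*") if p.is_file())

with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)
    (root / "a.txt").write_bytes(b"12345")             # 5 bytes
    (root / "sub").mkdir()
    (root / "sub" / "b.bin").write_bytes(b"\x00" * 7)  # 7 bytes
    total = total_size(root)
    print(total)  # 12
```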

So in summary, several approaches allow getting file sizes – pick the right one for your needs!

Handling Binary vs Text Files

One key point – file sizes mean different things for text vs. binary formats:

  • Binary files like images, videos, etc. have sizes equal to the full data size. A 5 MB image will have a file size around 5000000 bytes.
  • Text files use an encoding like UTF-8 to represent characters. The file size depends on the encoding as well as the text content.

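For example, say hello.txt just contains the text "Hello world" with no trailing newline. The size could be:

  • 11 bytes encoded as ASCII
  • 11 bytes encoded as UTF-8 (every character here is in the ASCII range, so each takes one byte)
  • 48 bytes encoded as UTF-32 (4 bytes per character plus a 4-byte byte order mark)

So always remember the distinction for text. The size reflects the number of bytes in the encoded representation, not the number of characters.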
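You can check encoded sizes without writing any files, since the length of the encoded bytes is what ends up on disk. This sketch uses Python's built-in codecs. Note that the "utf-16" and "utf-32" codecs prepend a byte order mark, which is included in the counts.

```python
text = "Hello world"  # 11 characters

# Number of bytes produced by each encoding.
# "utf-16" and "utf-32" include a byte order mark (2 and 4 bytes).
sizes = {enc: len(text.encode(enc)) for enc in ("ascii", "utf-8", "utf-16", "utf-32")}

for encoding, size in sizes.items():
    print(f"{encoding}: {size} bytes")
# ascii: 11 bytes
# utf-8: 11 bytes
# utf-16: 24 bytes
# utf-32: 48 bytes
```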

Monitoring File Size Changes

To monitor size changes over time, you can:

  • Check sizes periodically in a loop
  • Use a library like watchdog for notifications
  • Bind to OS events for file changes

For example, to poll every 5 seconds:

import time
import os

path = 'file.txt'

while True:
  size = os.path.getsize(path)
  print(f'{path} is currently {size} bytes')

  time.sleep(5) 

This continuously prints updated sizes.

For more advanced monitoring, you can use:

  • The watchdog library to trigger events on file changes.
  • pyinotify to bind to low-level OS events for file modifications (Linux only, since it wraps inotify).

With these techniques, you can actively monitor size changes in Python.
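A polling loop often only needs to report when the size actually changes. One possible sketch, using only the standard library, is a generator that yields a new size each time it differs from the previous one. The name size_changes, the interval, and the temporary file are assumptions made for the example.

```python
import os
import tempfile
import time

def size_changes(path, interval=5.0):
    """Yield the file size each time it differs from the last observed size."""
    last_size = None
    while True:
        size = os.path.getsize(path)
        if size != last_size:
            last_size = size
            yield size
        else:
            time.sleep(interval)  # wait before polling again

# Demonstration on an empty temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

watcher = size_changes(demo_path, interval=0.1)
first = next(watcher)            # initial size of the empty file
with open(demo_path, "ab") as f:
    f.write(b"hello")            # grow the file by 5 bytes
second = next(watcher)           # next reported change
print(first, second)             # 0 5
os.remove(demo_path)             # clean up the temporary file
```

In a real script you would iterate over the generator with a for loop and handle the case where the file is deleted while being watched.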

Sorting Files by Size

To list files sorted by size, you can:

  1. Get sizes for each file
  2. Sort filenames using the sizes

For example:

import os

directory = 'documents/'
files = os.listdir(directory)

# Get dict mapping names => sizes (join with the directory so paths resolve)
file_sizes = {f: os.path.getsize(os.path.join(directory, f)) for f in files}

# Sort filenames by size
sorted_files = sorted(file_sizes, key=file_sizes.get)

print(sorted_files)

This prints out all filenames sorted smallest to largest by size.

You can also sort in reverse order for largest first:

sorted_files = sorted(file_sizes, key=file_sizes.get, reverse=True)

Sorting by size allows displaying files in order and identifying the largest space hogs!
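An alternative sketch uses os.scandir(), which yields directory entries that can report their own size and type, so subdirectories are easy to skip. The function name files_by_size and the sample files are assumptions made for the example.

```python
import os
import tempfile

def files_by_size(directory, largest_first=True):
    """Return (name, size) pairs for regular files in directory, sorted by size."""
    with os.scandir(directory) as entries:
        sizes = [(e.name, e.stat().st_size) for e in entries if e.is_file()]
    return sorted(sizes, key=lambda pair: pair[1], reverse=largest_first)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create three sample files with known sizes.
    for name, length in (("small.txt", 1), ("large.txt", 100), ("medium.txt", 10)):
        with open(os.path.join(tmp_dir, name), "wb") as f:
            f.write(b"x" * length)
    ranked = files_by_size(tmp_dir)
    print(ranked)
    # [('large.txt', 100), ('medium.txt', 10), ('small.txt', 1)]
```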

Building GUIs to Display Sizes

Python GUI toolkits like Tkinter and PyQt allow building desktop UIs to show file sizes, and web frameworks like Django and Flask can do the same in a browser.

For example, you can build a simple Tkinter GUI:

import tkinter as tk
from tkinter import ttk
import os

root = tk.Tk()

def get_size():
  file_path = path_entry.get()
  size = os.path.getsize(file_path) 
  size_label['text'] = f'Size: {size} bytes'

path_entry = ttk.Entry(root)
path_entry.pack()

check_button = ttk.Button(root, text="Get Size", command=get_size)  
check_button.pack()

size_label = ttk.Label(root)
size_label.pack()

root.mainloop()

This allows entering a path, clicking “Get Size”, and showing the result.

You can build on this to make a full file explorer GUI with sizes displayed.
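Raw byte counts are hard to read in a UI, so a display label usually benefits from a formatted value. The helper below is a hypothetical sketch (format_size is not a standard library function) that converts a byte count into binary units, where 1 KiB is 1024 bytes.

```python
def format_size(num_bytes):
    """Format a byte count using binary units (1 KiB = 1024 bytes)."""
    size = float(num_bytes)
    for unit in ("bytes", "KiB", "MiB", "GiB"):
        # Stop at the first unit that fits, or at the largest unit we support.
        if size < 1024 or unit == "GiB":
            break
        size /= 1024
    if unit == "bytes":
        return f"{int(size)} bytes"
    return f"{size:.1f} {unit}"

print(format_size(512))          # 512 bytes
print(format_size(1536))         # 1.5 KiB
print(format_size(5 * 1024**2))  # 5.0 MiB
```

In a GUI callback, the formatted string would be assigned to the label instead of the raw number.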

Handling Errors

When getting sizes, you may encounter errors like:

  • FileNotFoundError – File does not exist at given path
  • PermissionError – No access permission for file
  • OSError – Other issue with accessing the file

Wrap calls in try/except blocks to handle errors gracefully:

import os

try:
  size = os.path.getsize('file.txt')
except FileNotFoundError:
  print('File not found')
except PermissionError:
  print('No access permission')

This makes your script more robust.
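Since FileNotFoundError and PermissionError are both subclasses of OSError, a general OSError clause should come last so the specific handlers get a chance to run first. A minimal sketch of a wrapper that covers all three cases follows. The name safe_getsize and the missing file path are assumptions made for the example.

```python
import os

def safe_getsize(path):
    """Return the size of path in bytes, or None if it cannot be read."""
    try:
        return os.path.getsize(path)
    except FileNotFoundError:
        print(f"{path}: file not found")
    except PermissionError:
        print(f"{path}: no access permission")
    except OSError as exc:
        # Catch-all for other OS-level failures; must come after the subclasses.
        print(f"{path}: {exc}")
    return None

# Assumes no file with this name exists in the current directory.
result = safe_getsize("no_such_file_for_this_example.txt")
print(result)
```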

Best Practices

To effectively get and use file sizes, follow these best practices:

  • Always handle errors – don’t ignore them.
  • Remember that text file sizes depend on the encoding as well as the content.
  • Use os.path.getsize() for simple size checks by path.
  • Leverage os.stat() if you need other file info too.
  • Pass file objects when you need incremental processing.
  • Explore pathlib for more Pythonic file handling.
  • Monitor sizes by polling, events, etc. for live updating.
  • Sort filenames by size to organize and understand usage.
  • Know your usage – only retrieve sizes as needed.

Conclusion

You should now be confident retrieving file sizes in Python using functions from os, pathlib, and more!

Some key points:

  • Sizes are in bytes – they reflect disk usage.
  • Multiple methods work by path or file object.
  • Text file sizes depend on the encoding; binary file sizes are the raw byte count.
  • Monitor sizes over time to track changes.
  • Sorting helps organize and understand usage.
  • GUIs can display sizes for usability.

There are many uses for file sizes like analytics, optimization, monitoring, and cleaning. Apply these Python skills to build powerful file handling applications!

For more details, explore the official Python documentation on:

  • os and os.path – for stat(), getsize(), etc.
  • pathlib – for Path objects and Path.stat()
  • io – for file objects and their seek() and tell() methods

This covers all the tools you need to handle file sizes like an expert Pythonista. Happy coding!
