Python traversal folder
Traversing Folders with Python
In our daily work, we often need to read a list of files in a folder, perhaps for statistical purposes or to perform certain file operations. Python provides several libraries that can help us traverse folders. In this article, I will introduce three common methods for traversing folders, provide examples, and analyze the advantages and disadvantages of each method.
Method 1: os.walk()
os.walk() is the most commonly used method for traversing folders in Python. This method automatically recursively traverses all subfolders in a folder. During the traversal, it returns a tuple (root, dirs, files), where root represents the path to the currently traversed folder, dirs represents the subfolders within the current folder, and files represents all files within the current folder.
import os
# Traverse a folder
def traversal_folder(folder_path):
for root, dirs, files in os.walk(folder_path):
# Traverse all files in the current folder
for file_name in files:
file_path = os.path.join(root, file_name)
print(file_path)
# Call the function
folder_path = r"D:data"
traversal_folder(folder_path)
Advantage: os.walk() automatically traverses subfolders within a folder, eliminating the need for manual recursion.
Disadvantage: For large folders, os.walk() can consume a significant amount of memory because it needs to store all files in memory, which can cause the program to crash.
Method 2: os.listdir()
os.listdir() is another commonly used method for traversing folders in Python. This method returns a list of all files and folders in the specified path, but does not traverse recursively.
import os
# Traverse a folder
def traversal_folder(folder_path):
for file_name in os.listdir(folder_path):
file_path = os.path.join(folder_path, file_name)
print(file_path)
# Call the function
folder_path = r"D:data"
traversal_folder(folder_path)
Advantage: os.listdir() is faster than os.walk().
Disadvantage: os.listdir() does not perform recursive traversal. If recursive traversal is required, you will need to add recursion to the function.
Method 3: glob.glob()
glob.glob() is a less commonly used method for traversing folders in Python. This method returns a list of all files and folders that match a specified filename pattern. Similar to os.listdir(), it does not perform recursive traversal.
import glob
# Traverse folders
def traversal_folder(folder_path):
for file_path in glob.glob(os.path.join(folder_path, '*')):
print(file_path)
# Call the function
folder_path = r"D:data"
traversal_folder(folder_path)
Advantages: glob.glob() is easy to use and requires minimal code.
Disadvantages: glob.glob() does not perform recursive traversal. If recursive traversal is required, you will need to add recursion to the function.
Conclusion
These are three common methods for traversing folders in Python. In practice, we can choose different methods to implement folder traversal based on specific circumstances. If you need to traverse recursively, we recommend using the os.walk() method. If you only need to traverse the current folder, we recommend using the os.listdir() or glob.glob() method.