Day 4 – A Python script to search files greater than X days old
Welcome to Day 4 of 101 Days of DevOps. Today's topic is a Python script that searches for files more than X days old. As a DevOps engineer or system administrator, this is one of the common tasks we encounter in our daily job. This kind of script is helpful when your server is low on disk space and you need to find older files and then delete them.
On Day 1, we discussed the os module, including os.walk(). If you want to brush up on the concept, here is the link; otherwise, check the video above if you want to go straight to the solution. https://www.101daysofdevops.com/courses/101-days-of-devops/lessons/day-1-python-os-module/
So far, our code looks like this: using os.walk(), we iterate over the /etc directory, and using os.path.join(), we combine each directory path with the filename.
import os

for dirpath, dirnames, filenames in os.walk("/etc"):
    for file in filenames:
        # Build the full path from the directory and the file name
        comp_path = os.path.join(dirpath, file)
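Once we have the complete path, the next step is to find when the file was created. We can do that with the help of os.path.getctime(filename). Note that on Unix-like systems getctime() returns the time of the last metadata change rather than a true creation time; on Windows it returns the creation time.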
>>> os.path.getctime("/etc/hosts")
1623687666.8315635
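A related call, os.path.getmtime(), reports when the file's content was last modified, which is often closer to what "old file" means during a cleanup. The following is a minimal sketch comparing the two; it assumes a throwaway temporary file (the name sample_path is only for illustration) instead of a real system file.

```python
import os
import tempfile
import time

# Create a throwaway file so the example does not depend on /etc/hosts.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    sample_path = tmp.name

# Pretend the content was last modified 30 days ago.
thirty_days_ago = time.time() - 30 * 24 * 60 * 60
os.utime(sample_path, (thirty_days_ago, thirty_days_ago))

mtime = os.path.getmtime(sample_path)  # last content modification
ctime = os.path.getctime(sample_path)  # Unix: last metadata change; Windows: creation

print("mtime:", mtime)
print("ctime:", ctime)

# Clean up the temporary file.
os.remove(sample_path)
```

Here the modification time is about 30 days in the past, while the value from getctime() stays close to the present, so a script based on getctime() would not treat this file as old.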
But the output of the above command is in seconds since the epoch. Now it's time to introduce a new Python module called datetime. Using datetime.datetime.fromtimestamp(), we can convert this POSIX timestamp into the corresponding local date and time, and later save that into a variable file_creation_time.
>>> datetime.datetime.fromtimestamp(os.path.getctime("/etc/hosts"))
datetime.datetime(2021, 6, 14, 9, 21, 6, 831563)
So far, our code looks like this.
import os
import datetime

for dirpath, dirnames, filenames in os.walk("/var/log"):
    for file in filenames:
        complete_path = os.path.join(dirpath, file)
        file_creation_time = datetime.datetime.fromtimestamp(os.path.getctime(complete_path))
Now that we have the date on which the file was created, the next step is to find the current date. That is easy with the datetime module: use datetime.datetime.now(), and save the output of the command below in the variable today_date.
>>> datetime.datetime.now()
datetime.datetime(2021, 7, 4, 10, 31, 41, 664467)
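The next step is to calculate the difference between the current date and the file creation date, and use the .days attribute of the result (an attribute, not a method) to get only the whole days. Save this value in the variable time_diff.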
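In code, this step is a subtraction followed by reading the days attribute of the resulting timedelta. Here is a minimal sketch that uses the two example values printed earlier in place of a real file, so that the result is reproducible. Inside the loop, the same step would be written as time_diff = (today_date - file_creation_time).days.

```python
import datetime

# Fixed example values taken from the outputs shown earlier in this post.
file_creation_time = datetime.datetime(2021, 6, 14, 9, 21, 6, 831563)
today_date = datetime.datetime(2021, 7, 4, 10, 31, 41, 664467)

# Subtracting two datetime objects yields a datetime.timedelta.
age = today_date - file_creation_time

# .days is an attribute (no parentheses) holding the whole days in the difference.
time_diff = age.days
print(time_diff)  # 20
```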
Now, depending on your requirement, use an if condition to select the files, e.g., those more than 15 days old, and print the output.
file_age = 15
if time_diff > file_age:
    print(complete_path, time_diff)
So your complete code will look like this
import os
import datetime

file_age = 15  # age threshold in days
today_date = datetime.datetime.now()

for dirpath, dirnames, filenames in os.walk("/var/log"):
    for file in filenames:
        complete_path = os.path.join(dirpath, file)
        file_creation_time = datetime.datetime.fromtimestamp(os.path.getctime(complete_path))
        # Whole days between now and the file's timestamp
        time_diff = (today_date - file_creation_time).days
        if time_diff > file_age:
            print(complete_path, time_diff)
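One practical wrinkle: os.path.getctime() raises an OSError if a path cannot be read, for example a broken symlink or a file that is removed while the walk is in progress, and that would stop the script partway through. The following is a minimal sketch of one way to skip such entries. The name iter_old_files and its now parameter are hypothetical, introduced only for illustration, and are not part of the script above.

```python
import datetime
import os


def iter_old_files(root, min_age_days, now=None):
    """Yield (path, age_in_days) for files under root older than min_age_days.

    Hypothetical helper for illustration; not part of the original script.
    """
    if now is None:
        now = datetime.datetime.now()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            complete_path = os.path.join(dirpath, name)
            try:
                changed = os.path.getctime(complete_path)
            except OSError:
                # Broken symlink, file removed mid-walk, or permission denied.
                continue
            age_days = (now - datetime.datetime.fromtimestamp(changed)).days
            if age_days > min_age_days:
                yield complete_path, age_days
```

A call such as iter_old_files("/var/log", 15) would then yield the same path and age pairs that the loop above prints, without stopping on unreadable entries.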
- Try converting the above code into functions. Create a separate function to perform the date calculation.
- Write a script to search for files with specific extensions. E.g., you want to search all .jpeg or .txt files in the /var directory.
I look forward to you joining this amazing journey.