Quickly Listing All Files in a Directory with Python

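Sometimes we need a quick overview of every file under a directory without browsing it folder by folder. On Linux, ls -R does this.

Its output looks like this: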

ex@ex:~/test$ ls -R
.:
1.png  base       Config.ini  EternalBlue.txt  js      main.c
a.out  Cknife.db  css         index.html       key.py  smb.pcapng

./css:
main.css  media.css  reboot.css  reset.css

./js:
jquery.min.js  jquery.simplesidebar.js  jquery-ui.min.js

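However, the output of that command cannot be piped straight into xargs for further processing, and it is also awkward for other programs to parse. The script below solves this and works with both Python 2 and Python 3. It does not follow symbolic links or shortcuts, since following them could lead to an infinite loop.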

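#!/usr/bin/python

import os
import sys

files = []


def get_all_file_name(directory):
    # Recursively collect regular files; symlinks are skipped to avoid loops.
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            continue
        elif os.path.isdir(path):
            get_all_file_name(path)
        elif os.path.isfile(path):
            # Escape backslashes, spaces, and quotes so that xargs
            # reads each path as a single argument.
            path = (path.replace('\\', '\\\\').replace(' ', '\\ ')
                        .replace("'", "\\'").replace('"', '\\"'))
            files.append(path)


# Use the directory given on the command line, or the current one.
root_dir = sys.argv[1] if len(sys.argv) > 1 else '.'

get_all_file_name(root_dir)

for path in files:
    print(path)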

Running it gives:

ex@ex:~/test$ fileall
./EternalBlue.txt
./index.html
./a.out
./Cknife.db
./main.c
./css/main.css
./css/reboot.css
./css/media.css
./css/reset.css
./1.png
./Config.ini
./key.py
./base
./smb.pcapng
./js/jquery.min.js
./js/jquery-ui.min.js
./js/jquery.simplesidebar.js
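
The same traversal can also be sketched with os.walk from the standard library, which handles the recursion itself. The names iter_files and write_null_delimited below are hypothetical helpers, not part of the original script. Instead of backslash-escaping each path, this variant assumes the consumer is xargs -0, which reads NUL-separated input and therefore needs no escaping at all.

```python
import os


def iter_files(root):
    """Yield the path of every regular file under root, skipping symlinks."""
    # followlinks=False (the default) means os.walk never descends into
    # symlinked directories, so a link cycle cannot cause endless recursion.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks to files and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            yield path


def write_null_delimited(paths, stream):
    """Write each path followed by a NUL byte, the format xargs -0 reads."""
    for path in paths:
        stream.write(path + '\0')
```

Calling write_null_delimited(iter_files('.'), sys.stdout) from a small script and piping the result into xargs -0 passes every path through intact, including names that contain spaces or quotes.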

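The following code is an improved version of the fileall script. It identifies the type of every file and counts how many files share each type, which makes it easy to see how many files of each kind there are and to track down suspicious ones. Because it relies on the Linux file command, it can only be used on Linux.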

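#!/usr/bin/python

import os
import subprocess
import sys

try:
    input = raw_input  # Python 2: read the answer as text, do not evaluate it
except NameError:
    pass

files = []


def get_all_file_name(directory):
    # Recursively collect regular files; symlinks are skipped to avoid loops.
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.islink(path):
            continue
        elif os.path.isdir(path):
            get_all_file_name(path)
        elif os.path.isfile(path):
            files.append(path)


# Use the directory given on the command line, or the current one.
root_dir = sys.argv[1] if len(sys.argv) > 1 else '.'

get_all_file_name(root_dir)

types = []      # distinct type descriptions
types_num = []  # number of files of each type
file_list = []  # paths of the files of each type
for path in files:
    # "file -b" prints only the description, so names containing spaces
    # cannot confuse the parsing; the argument list bypasses the shell.
    result = subprocess.check_output(['file', '-b', '--', path])
    result = result.decode('utf-8', 'replace')
    if result in types:
        num = types.index(result)
        types_num[num] += 1
        file_list[num].append(path)
    else:
        types.append(result)
        types_num.append(1)
        file_list.append([path])

for i in range(len(types)):
    print('(' + str(i) + ')' + '\t' + str(types_num[i]) + ' times :')
    # The description keeps the trailing newline from "file",
    # which leaves a blank line between entries.
    print(types[i])

arg = int(input('input number to watch file or -1 for quit >>>'))

if arg != -1:
    for path in file_list[arg]:
        print(path)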
Output:

ex@ex:~/test$ filetypes
(0) 1 times :
UTF-8 Unicode text

(1) 1 times :
HTML document, UTF-8 Unicode text, with CRLF line terminators

(2) 1 times :
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=d6adff0cb6c78c0bb7f0596f6db1a20ca146603b, not stripped

(3) 1 times :
SQLite 3.x database, last written using SQLite version 3007002

(4) 1 times :
C source, ASCII text

(5) 3 times :
ASCII text, with CRLF line terminators

(6) 1 times :
ASCII text, with very long lines, with CRLF line terminators

(7) 1 times :
JPEG image data, JFIF standard 1.01, resolution (DPI), density 96x96, segment length 16, baseline, precision 8, 520x339, frames 3

(8) 3 times :
ASCII text, with very long lines

(9) 1 times :
HTML document, ASCII text

(10)    1 times :
empty

(11)    1 times :
pcap-ng capture file - version 1.0

(12)    1 times :
ASCII text

input number to watch file or -1 for quit >>>8
./Config.ini
./js/jquery.min.js
./js/jquery-ui.min.js
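
The script keeps three parallel lists (types, types_num, file_list) in step with one another. As a rough sketch of an alternative, the same grouping can be held in a single dictionary that maps each type description to its list of paths, so the count is simply the length of that list. The names describe and group_by_type are hypothetical, and describe assumes the file command is available on the PATH.

```python
import subprocess
from collections import defaultdict


def describe(path):
    """Return the type description that `file -b` prints for path."""
    # -b (brief) leaves the file name out of the output;
    # "--" marks the end of the options.
    out = subprocess.check_output(['file', '-b', '--', path])
    return out.decode('utf-8', 'replace').strip()


def group_by_type(paths, classify=describe):
    """Map each type description to the list of paths that have it."""
    groups = defaultdict(list)
    for path in paths:
        groups[classify(path)].append(path)
    return dict(groups)
```

With groups = group_by_type(files), len(groups[t]) gives the number of files of type t and groups[t] lists them. The classify parameter is there so the grouping logic can be tried with a stand-in function on machines without the file command.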