utils

1. Download files via FTP, SFTP, wget, or a web crawler

ftppro

@Project : lb_toolkits

@File : ftppro.py

@Modify Time : 2022/8/11 15:34

@Author : Lee

@Version : 1.0

@Description : Upload and download files over FTP using ftplib

class lb_toolkits.utils.ftppro.ftppro(ip, user=None, password=None, TLSFlag=False)

Bases: object

check_read_permission(ftp, dir_path)

Check whether read permission is granted for the directory.
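The permission check above could be sketched with ftplib as follows. This is only an illustration: it assumes the check enters the directory and lists it, treating a permanent FTP error as missing permission, which the documentation does not confirm. The name `can_read_dir` is hypothetical and not part of lb_toolkits.

```python
from ftplib import error_perm


def can_read_dir(ftp, dir_path):
    """Return True if `dir_path` can be entered and listed on `ftp`.

    `ftp` is expected to behave like ftplib.FTP (pwd, cwd and retrlines).
    This helper is hypothetical and not part of lb_toolkits.
    """
    original = ftp.pwd()                         # remember the current directory
    try:
        ftp.cwd(dir_path)                        # raises error_perm (5xx) when refused
        ftp.retrlines("LIST", lambda line: None)  # listing needs read permission
        return True
    except error_perm:
        return False
    finally:
        ftp.cwd(original)                        # restore the working directory
```

Passing the connection in as an argument mirrors the documented `check_read_permission(ftp, dir_path)` signature.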

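close(ftp)
connect(timeout=300)
downloadFile(remoteFile, localPath, blocksize=1024, skip_download=False, cover=False)

Download a file over FTP.

Parameters:
  • remoteFile (str)

  • localPath (str)

  • blocksize

  • skip_download (bool) -- Whether to skip the download. If True, the file is not downloaded and only its file name is returned; otherwise the file is downloaded.

  • cover (bool) -- What to do when the file already exists. If True, the existing file is deleted and downloaded again; if False, the download is skipped.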
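The interplay of `skip_download` and `cover` can be sketched with plain ftplib calls. This is a minimal illustration under stated assumptions, not the library's implementation: the helper name `download_file` is hypothetical, and treating `local_path` as a target directory is a guess.

```python
import os
import posixpath


def download_file(ftp, remote_file, local_path, blocksize=1024,
                  skip_download=False, cover=False):
    """Download `remote_file` into the directory `local_path`; return the local name.

    Hypothetical helper; `ftp` must provide ftplib.FTP.retrbinary.
    """
    local_file = os.path.join(local_path, posixpath.basename(remote_file))

    if skip_download:
        return local_file                  # only report the name, fetch nothing

    if os.path.isfile(local_file):
        if not cover:
            return local_file              # keep the existing copy
        os.remove(local_file)              # cover=True: delete, then download again

    os.makedirs(local_path, exist_ok=True)
    with open(local_file, "wb") as fh:
        # retrbinary streams the file and calls fh.write for each received chunk
        ftp.retrbinary("RETR " + remote_file, fh.write, blocksize)
    return local_file
```

With an open `ftplib.FTP` connection `ftp`, a call such as `download_file(ftp, "/pub/a.txt", "./data", cover=True)` would always fetch a fresh copy.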

list_dir_regex(remotePath, regexStr)
listdir(dirname, pattern=None)
uploadFile(localfile, remotePath, cover=True, blocksize=1024)

Upload a file. Note: this function has not been tested, so it is not confirmed to work.

Parameters:
  • localfile (str) -- Local storage path

  • remotePath (str) -- Remote upload path

  • remoteFile (str) -- Remote file

  • blocksize (int) -- Upload block size, in bytes

sftppro

@Project : lb_toolkits

@File : sftppro.py

@Modify Time : 2022/7/21 17:09

@Author : Lee

@Version : 1.0

@Description : Upload or download a single file, or all files under a path, over SFTP using the paramiko library

class lb_toolkits.utils.sftppro.sftppro(ip=None, port=22, username=None, password=None, PKEY=None, timeout=300)

Bases: object

DownloadPath(remote_path, local_path, retry=3, okstatus=False, redownload=False, pathdownload=False)

Recursively download all files under a directory.

Parameters:
  • remote_path

  • local_path

Exec_cmd(command)
GetStrTime(s=None)
JudgeLocalFileExist(filename)

Check whether a local file exists. Returns True if it exists, otherwise False.

Parameters:
  • filename

JudgeRemoteExist(remote)
JudgeRemotePathExist(remote_path)
UploadPath(local_path, remote_path, retry=3, okstatus=False, reupload=False, pathupload=False)
WriteOK(s, okname)

Write an OK file that marks the download status of a file.

Parameters:
  • s -- Content written to the file

  • okname -- Name of the OK file
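The OK-file idea can be illustrated with a small sketch: a marker file is written next to the data file once a transfer finishes, so later runs can tell completed downloads from partial ones. The function names and the `.OK` suffix below are assumptions for illustration; the documentation does not state the naming scheme.

```python
import os


def write_ok(content, okname):
    """Write `content` to the marker file `okname` and return its path."""
    folder = os.path.dirname(okname)
    if folder:
        os.makedirs(folder, exist_ok=True)   # make sure the target folder exists
    with open(okname, "w", encoding="utf-8") as fh:
        fh.write(content)
    return okname


def is_done(filename):
    """Treat a download as finished when `<filename>.OK` exists (assumed suffix)."""
    return os.path.isfile(filename + ".OK")
```

A transfer loop could call `is_done(path)` before downloading and `write_ok(timestamp, path + ".OK")` after a successful transfer.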

close()

Close the SFTP channel.

connect(timeout=300)
download(remotepath, localpath, retry=3, redownload=False, pathdownload=False, okstatus=False)

Recursively download all files under a directory.

Parameters:
  • remotepath

  • localpath

  • retry

  • redownload

  • pathdownload

  • okstatus
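The `retry` parameter suggests that a failed transfer is attempted again a limited number of times. A generic sketch of such a loop is shown here; the helper `call_with_retry` is hypothetical, and catching only `OSError` is an assumption about which failures are worth retrying.

```python
def call_with_retry(action, retry=3):
    """Call `action()` up to `retry` times; return its result or re-raise the last error."""
    if retry < 1:
        raise ValueError("retry must be at least 1")
    last_error = None
    for _ in range(retry):
        try:
            return action()
        except OSError as err:   # file and network errors (IOError is an alias of OSError)
            last_error = err     # remember the failure and try again
    raise last_error
```

Wrapping a single transfer as `call_with_retry(lambda: sftp.get(remote, local), retry=3)` keeps the retry policy separate from the transfer code.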

get(remotepath, localpath, callback=None, chucksize=1024, prefetch=True)

Download a data file from the remote host.

Parameters:
  • remotepath

  • localpath

  • callback

  • chucksize

  • prefetch

makedirs(path)

Create a directory on the remote host. The directory is created if it does not exist and left untouched if it already exists.

Parameters:
  • path
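Creating a remote directory only when it is missing can be sketched with two paramiko `SFTPClient` calls, `stat` and `mkdir`. This is a minimal sketch that also creates missing parent directories, which is an assumption; the function name `remote_makedirs` is hypothetical.

```python
import posixpath


def remote_makedirs(sftp, path):
    """Create `path` and any missing parents on the remote side.

    `sftp` is expected to behave like paramiko.SFTPClient (stat and mkdir).
    Returns the list of directories that were actually created.
    """
    created = []
    current = "/" if path.startswith("/") else ""
    for part in path.strip("/").split("/"):
        if not part:
            continue
        current = posixpath.join(current, part)
        try:
            sftp.stat(current)          # succeeds when the directory already exists
        except IOError:
            sftp.mkdir(current)         # missing: create this level
            created.append(current)
    return created
```

Remote paths are joined with `posixpath` because SFTP paths use forward slashes regardless of the local operating system.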

upload(localpath, remotepath, retry=3, reupload=False, pathupload=False, okstatus=False)

Upload a single file, or all files under a path, over SFTP.

Parameters:
  • localpath

  • remotepath

  • retry

  • reupload

  • pathupload

  • okstatus

spider

@Project : lb_toolkits

@File : spider.py

@Modify Time : 2022/7/14 10:28

@Author : Lee

@Version : 1.0

@Description :

class lb_toolkits.utils.spider.spiderdownload(username=None, password=None)

Bases: object

download(outdir, url, timeout=300, skip_download=False, cover=False)

Download a Landsat scene.

Parameters:
  • identifier (str) -- Scene Entity ID or Display ID.

  • outdir (str) -- Output directory. Automatically created if it does not exist.

  • dataset (str, optional) -- Dataset name. If not provided, automatically guessed from scene id.

  • timeout (int, optional) -- Connection timeout in seconds.

  • skip_download (bool, optional) -- Skip download, only returns the remote filename.

  • cover (bool, optional) -- If True, an existing file is overwritten; if False, the download is skipped when the file already exists.

Returns:

filename -- Path to downloaded file.

Return type:

str

get_tokens(body, pattern='name="csrf" value="(.+?)"')

Get csrf_token and __ncforminfo.
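The default `pattern` shown in the signature is a regular expression with one capture group. A minimal sketch of how such a pattern pulls a token out of a login page follows; the helper name `get_token` and the sample HTML are illustrative only.

```python
import re


def get_token(body, pattern='name="csrf" value="(.+?)"'):
    """Return the first capture group of `pattern` found in `body`, or None."""
    match = re.search(pattern, body)
    return match.group(1) if match else None


# Illustrative HTML fragment, not taken from any real site
page = '<form><input type="hidden" name="csrf" value="abc123"></form>'
token = get_token(page)
```

The non-greedy `(.+?)` stops at the first closing quote, so attributes that follow the token on the same line are not captured.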

logged_in()

Check whether the login was successful, based on session cookies.

login(username, password, url_login)

Login to URL.

logout(url_logout)

Log out from URL.

searchfile(url, pattern='.tif', attrs={})

Parameters:

nowdate

Returns:

lb_toolkits.utils.spider.spiderhref(url, pattern=None, attrs={})

Scrape the links found on the page at url.
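Collecting links from a page and filtering them by a pattern can be sketched with the standard library alone. This sketch works on HTML text that has already been fetched, and it uses `html.parser` rather than whatever parser the library relies on; `extract_hrefs` and `_HrefCollector` are hypothetical names.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_hrefs(html, base_url, pattern=None):
    """Return absolute links from `html`, optionally filtered by regex `pattern`."""
    parser = _HrefCollector()
    parser.feed(html)
    links = [urljoin(base_url, href) for href in parser.hrefs]
    if pattern is not None:
        links = [link for link in links if re.search(pattern, link)]
    return links
```

Resolving every link against `base_url` turns relative entries of a directory listing into URLs that can be passed directly to a downloader.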

lb_toolkits.utils.spider.spidertable(url, outname=None, format='dict')

Scrape the table found on the page at url.
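Turning an HTML table into dictionaries can be sketched as follows. The sketch assumes that the first row holds the column names and that `format='dict'` means one dictionary per data row; neither is confirmed by the documentation. `table_to_dicts` and `_TableParser` are hypothetical names, and the input is HTML text that has already been fetched.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collect the cell text of every <tr> in the document."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None     # cells of the row being read, or None outside <tr>
        self._cell = None    # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_dicts(html):
    """Return one dict per data row, keyed by the text of the header row."""
    parser = _TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]
```

Pages with several tables or with merged cells would need more bookkeeping than this sketch provides.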

lb_toolkits.utils.spider.spidertable1(url)

wget

@Project : lb_toolkits

@File : wget.py

@Modify Time : 2023/10/30 14:19

@Author : Lee

@Version : 1.0

@Description :

lb_toolkits.utils.wget.get_wget(path)

Get the path to the wget executable.

Parameters:

path (str) -- Path to wget

Returns:

wgetpath

Return type:

str
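Locating a wget executable can be sketched with the standard library. The fallback to a search on `PATH` is an assumption about what a lookup like this might do, and `find_wget` is a hypothetical name.

```python
import os
import shutil


def find_wget(path=None):
    """Return `path` if it points to a file, else the wget found on PATH (or None)."""
    if path and os.path.isfile(path):
        return path
    return shutil.which("wget")   # None when wget is not installed
```

Returning `None` instead of raising lets the caller decide whether a missing wget is an error.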

lb_toolkits.utils.wget.wget(outdir, url, username=None, password=None, token=None, tries=3, skip_download=False, cover=False, timeout=300, continuing=True, wgetpath=None, options=None)

Download data using wget.
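How the documented parameters might map onto a wget command line is sketched here. The mapping is an assumption: it uses standard GNU wget options (`--tries`, `--timeout`, `--directory-prefix`, `--continue`, `--user`, `--password`, `--header`), treats `token` as a bearer token, and treats `options` as a list of extra arguments. `build_wget_command` is a hypothetical name, and the sketch only builds the argument list without running it.

```python
def build_wget_command(outdir, url, username=None, password=None, token=None,
                       tries=3, timeout=300, continuing=True,
                       wgetpath="wget", options=None):
    """Build the argument list for a wget call (assumed flag mapping)."""
    cmd = [
        wgetpath,
        f"--tries={tries}",
        f"--timeout={timeout}",
        f"--directory-prefix={outdir}",   # save files into outdir
    ]
    if continuing:
        cmd.append("--continue")          # resume partially downloaded files
    if username is not None:
        cmd.append(f"--user={username}")
    if password is not None:
        cmd.append(f"--password={password}")
    if token is not None:
        # assumption: the token is sent as an HTTP bearer token
        cmd.append(f"--header=Authorization: Bearer {token}")
    if options:
        cmd.extend(options)               # assumption: a list of extra wget arguments
    cmd.append(url)
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with URLs and passwords.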