MATLAB | How to Download Figures from the Top Journal PNAS with MATLAB (Includes Figures from the Last 3 Years)

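The long-awaited code for fetching PNAS figures is finally here. I spent a good while on it but could not work out how to retrieve figures from paywalled articles, so the code only downloads figures from open-access articles (partly to avoid copyright trouble). Even so, three years of open-access articles add up to nearly 17,000 images. Running the code still requires a VPN or proxy connection, so I recommend downloading the image archives I have already prepared, linked at the end of this post.

Here is the code anyway. To use it, run getPNASJPG(YEAR) at the command line, where YEAR is the publication year, for example getPNASJPG(2022). If you hit a 403 error, wait a while and run it again; it usually recovers on its own. The code: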

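function getPNASJPG(YEAR)
% getPNASJPG  Download figures from open-access PNAS articles of one year.
%   getPNASJPG(YEAR) saves the images to the folder image_Year_<YEAR>.
if nargin < 1
    YEAR = 2023;
end
YEAR         = num2str(YEAR);
% Archive pages are grouped by decade and year, e.g. 'd2020.y2023'
str_YEAR     = ['d',YEAR(1:3),'0','.y',YEAR];
options      = weboptions('Timeout',inf);
url_archive  = ['https://www.pnas.org/loi/pnas/group/',str_YEAR];
html_archive = webread(url_archive,options);
% Take the heading of the first issue listed and split it at '|' and '</h2>'
A_issue      = strfind(html_archive,'past-issue__content__item--all-details d-flex flex-column');
str_issue    = html_archive(A_issue(1)+50:A_issue(1)+100);
S1_issue     = strfind(str_issue,'|');
S2_issue     = strfind(str_issue,'</h2>');
str1_issue   = str_issue(S1_issue(1):S1_issue(2));
str2_issue   = str_issue(S1_issue(2):S2_issue(1));
% Keep only the digit characters: num1_issue is used as the volume in issue
% URLs, num2_issue as the number of the last issue to visit
num1_issue   = str2double(str1_issue(str1_issue>='0'&str1_issue<='9'));
num2_issue   = str2double(str2_issue(str2_issue>='0'&str2_issue<='9'));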

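% Resume from a saved checkpoint (issue i, article j, image k) if one exists
ibegin = 1; jbegin = 1; kbegin = 1;
forderName=['Year_',num2str(YEAR)];
if exist(fullfile('.',['image_',forderName],'ijkbreak.mat'),'file')
    load(fullfile('.',['image_',forderName],'ijkbreak.mat'),'ibegin','jbegin','kbegin');
end
if ~exist(fullfile('.',['image_',forderName]),'dir')
    mkdir(fullfile('.',['image_',forderName]));
end
disp([ibegin,jbegin,kbegin])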

for i = ibegin:num2_issue
    % Fetch the table of contents of issue i and keep only the part between
    % the first 'Research Article' marker and the 'Recent Issues' block
    url_issue  = ['https://www.pnas.org/toc/pnas/',num2str(num1_issue),'/',num2str(i)];
    html_issue = webread(url_issue,options);
    A_article  = strfind(html_issue,'Research Article');
    Z_article  = strfind(html_issue,'Recent Issues');
    html_issue = html_issue(A_article(1):Z_article(1));

    % B_article marks open-access icons, A_article the article title links
    B_article  = strfind(html_issue,'icon-open-access');
    A_article  = strfind(html_issue,'text-reset animation-underline');
    Z_article  = strfind(html_issue,'title="');
    for j = jbegin:length(B_article)
        % First title link after the j-th open-access icon
        tA_article   = A_article(find(B_article(j)<A_article,1));
        url_article  = html_issue(tA_article:Z_article(find(Z_article>tA_article,1)));
        % Strip the fixed-length text before the path and the 3 trailing characters
        url_article  = url_article(39:end-3);
        url_article  = ['https://www.pnas.org',url_article];
        html_article = webread(url_article,options);

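        % Image paths start with '/<article id>/asset/' and end with 'jpg'
        A_JPG   = strfind(html_article,[url_article(find(url_article=='/',1,'last'):end),'/asset/']);
        Z_JPG   = strfind(html_article,'jpg" height=');

        for k = kbegin:length(A_JPG)
            try
                % Save the checkpoint first so an interrupted run resumes at this image
                ibegin = i ; jbegin = j; kbegin = k;
                save(fullfile('.',['image_',forderName],'ijkbreak.mat'),'ibegin','jbegin','kbegin')
                url_JPG = ['https://www.pnas.org/cms/10.1073',html_article(A_JPG(k):Z_JPG(k)+2)];
                name_JPG = fullfile('.',['image_',forderName],url_JPG(find(url_JPG=='/',1,'last')+1:end));
                websave(name_JPG,url_JPG,options);
                disp(['Downloading Year-',YEAR,...
                     ' Issue-',num2str(i),' Article-',num2str(j),...
                     ' Pic-',num2str(k),':',url_article(22:end)])
            catch ME
                % Skip images that fail to download, but report the reason
                warning('getPNASJPG:downloadFailed','%s',ME.message);
            end
        end
        kbegin = 1; % later articles start from their first image
    end
    jbegin = 1; % later issues start from their first article
end
end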
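
The advice above for 403 errors (wait, then run again) can be automated. The following is only a minimal sketch: `retryLater`, `maxTries`, and `waitSeconds` are hypothetical names introduced here and are not part of the original code. Because getPNASJPG reloads ijkbreak.mat every time it starts, each new attempt should continue from the last saved position rather than from the beginning.

```matlab
function ok = retryLater(task, maxTries, waitSeconds)
% retryLater  Call a zero-argument function handle until it succeeds.
%   Hypothetical helper for illustration, not part of the original post.
%   Returns true on the first successful call, false if every attempt fails.
ok = false;
for attempt = 1:maxTries
    try
        task();
        ok = true;
        return
    catch err
        fprintf('Attempt %d/%d failed: %s\n', attempt, maxTries, err.message);
        if attempt < maxTries
            pause(waitSeconds);   % wait before the next attempt
        end
    end
end
end
```

A call such as `retryLater(@() getPNASJPG(2022), 5, 600)` would try up to five times with ten-minute pauses; the numbers are arbitrary and should be chosen conservatively so the site is not hammered.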

The code supports resumable downloads: you can interrupt the program partway through and continue from where it stopped later on.
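
The resume mechanism boils down to saving the loop index before each unit of work and reloading it at startup. Here is a stripped-down sketch of that pattern with the network calls removed; `resumableLoop`, `checkpointFile`, and `stopAfter` are illustrative names, and `stopAfter` exists only to simulate an interruption.

```matlab
function done = resumableLoop(checkpointFile, nItems, stopAfter)
% resumableLoop  Process items 1..nItems, resuming from a saved index.
%   Illustrative sketch only. checkpointFile should end in .mat.
%   The call stops early once stopAfter items have been processed.
kbegin = 1;
if exist(checkpointFile, 'file')
    s = load(checkpointFile, 'kbegin');
    kbegin = s.kbegin;                 % continue from the saved index
end
done = [];
for k = kbegin:nItems
    if numel(done) >= stopAfter
        return                         % simulated interruption
    end
    kbegin = k;
    save(checkpointFile, 'kbegin');    % checkpoint before working on item k
    done(end+1) = k;                   % stand-in for the actual download
end
end
```

Since the index is written before the work is done, the most recently checkpointed item is processed again after a restart. For downloads that is harmless, because the file is simply fetched again and overwritten.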

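If you come across a particularly good figure and want to read the article it came from, the file names produced by this code record the source of each image. For example, suppose you are interested in the figure below, named pnas.2212633120fig06:

Just enter the article link in your browser: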

  • https://www.pnas.org/doi/10.1073/pnas.2212633120

Sure enough, it is Fig. 6, an exact match!
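
The lookup above can also be done in code. This sketch assumes file names follow the pattern seen in the example, an article ID such as pnas.2212633120 followed by fig and a figure number; `figureSource` is a hypothetical helper, and names that do not match the pattern return an empty URL.

```matlab
function [url, figNum] = figureSource(fileName)
% figureSource  Recover the article URL and figure number from an image name.
%   Hypothetical helper; assumes names like 'pnas.2212633120fig06.jpg'.
url    = '';
figNum = NaN;
tok = regexp(fileName, '(pnas\.\d+)fig(\d+)', 'tokens', 'once');
if ~isempty(tok)
    url    = ['https://www.pnas.org/doi/10.1073/', tok{1}];
    figNum = str2double(tok{2});
end
end
```

For instance, `[url, n] = figureSource('pnas.2212633120fig06.jpg')` yields the article link shown above together with the figure number 6.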


Sample Figures

Figure quality in PNAS varies quite a lot, so be selective about what you learn from. Here are some of the more interesting figures:

2023


2022


2021


Getting the Images

Baidu Netdisk

Baidu Netdisk links for the figures from the last three years, about 17,000 images in total:

2023 (2.49 GB, 3,209 images)

Link:
https://pan.baidu.com/s/1YxRmt53jH-_TXGg6zkqtIg?pwd=slan
Extraction code: slan

2022, first half (3.12 GB, 3,329 images)

Link:
https://pan.baidu.com/s/1vFcEy48oOklW9UOUShVeAA?pwd=slan
Extraction code: slan

2022, second half (3.02 GB, 3,359 images)

Link:
https://pan.baidu.com/s/1ItVAmS18DcwlCNsM2u5rwg?pwd=slan
Extraction code: slan

2021, first half (2.61 GB, 3,077 images)

Link:
https://pan.baidu.com/s/1XHYlxR9_s1Ly9LCtlfnrhQ?pwd=slan
Extraction code: slan

2021, second half (3.35 GB, 3,887 images)

Link:
https://pan.baidu.com/s/1uCUoi_hUUKlZ3kfc2oI4Yw?pwd=slan
Extraction code: slan

Gitee Repository

If the Netdisk links stop working, the latest links can be found in the Gitee repository:

https://gitee.com/slandarer/pnas-figures


Reposted from blog.csdn.net/slandarer/article/details/131375607