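Stroustrup recently published a series of posts debunking popular myths about C++. The fifth myth is: "C++ is for large, complicated, programs only". To debunk it, he wrote a simple C++ program that downloads a web page and extracts the links from it. Here it is: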
    #include <string>
    #include <set>
    #include <iostream>
    #include <sstream>
    #include <regex>
    #include <boost/asio.hpp>

    using namespace std;

    set<string> get_strings(istream& is, regex pat)
    {
        set<string> res;
        smatch m;
        for (string s; getline(is, s);) // read a line
            if (regex_search(s, m, pat))
                res.insert(m[0]); // save match in set
        return res;
    }

    void connect_to_file(iostream& s, const string& server, const string& file)
    // open a connection to server and attach file to s
    // skip headers
    {
        if (!s)
            throw runtime_error{ "can't connect\n" };

        // Request to read the file from the server:
        s << "GET " << "http://" + server + "/" + file << " HTTP/1.0\r\n";
        s << "Host: " << server << "\r\n";
        s << "Accept: */*\r\n";
        s << "Connection: close\r\n\r\n";

        // Check that the response is OK:
        string http_version;
        unsigned int status_code;
        s >> http_version >> status_code;

        string status_message;
        getline(s, status_message);
        if (!s || http_version.substr(0, 5) != "HTTP/")
            throw runtime_error{ "Invalid response\n" };

        if (status_code != 200)
            throw runtime_error{ "Response returned with status code" };

        // Discard the response headers, which are terminated by a blank line:
        string header;
        while (getline(s, header) && header != "\r")
            ;
    }

    int main()
    {
        try {
            string server = "www.stroustrup.com";
            boost::asio::ip::tcp::iostream s{ server, "http" }; // make a connection
            connect_to_file(s, server, "C++.html"); // check and open file

            regex pat{ R"((http://)?www([./#\+-]\w*)+)" }; // URL
            for (auto x : get_strings(s, pat)) // look for URLs
                cout << x << '\n';
        }
        catch (std::exception& e) {
            std::cout << "Exception: " << e.what() << "\n";
            return 1;
        }
    }
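One detail of the program above: get_strings calls regex_search once per line, so it keeps only the first match on each line. Below is a minimal sketch of a variant that collects every match, assuming iterating with std::sregex_iterator is acceptable here. The name get_all_strings is hypothetical and not part of Stroustrup's program; it uses only the standard library and reads from any istream, so it can be tried without a network connection.

```cpp
#include <istream>
#include <regex>
#include <set>
#include <sstream>
#include <string>

// Hypothetical variant of get_strings: collects every match on each line,
// not only the first one.
std::set<std::string> get_all_strings(std::istream& is, const std::regex& pat)
{
    std::set<std::string> res;
    for (std::string line; std::getline(is, line);) {
        // sregex_iterator visits all non-overlapping matches in the line;
        // a default-constructed iterator marks the end of the sequence.
        for (std::sregex_iterator it{line.cbegin(), line.cend(), pat}, end; it != end; ++it)
            res.insert(it->str());
    }
    return res;
}
```

Feeding it a std::istringstream holding a line with two URLs should yield both, where the original function would keep only the first.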
Let's show Stroustrup what a small and readable program actually is.
- Download http://www.stroustrup.com/C++.html
- List all links:

http://www-h.eng.cam.ac.uk/help/tpl/languages/C++.html
http://www.accu.org
http://www.artima.co/cppsource
http://www.boost.org
...

You can use any language, but no third-party libraries are allowed.
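Winner
The C++ answer won by votes, but it relies on a third-party library (which the rules disallow) and, like another close competitor, Bash, relies on a hacked-together HTTP client (it won't work with HTTPS, gzip, redirects, etc.). So Wolfram is the clear winner. Another solution that comes very close in size and readability is PowerShell (with improvements from the comments), but it hasn't received much attention. Mainstream languages (Python, C#) came pretty close too.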
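To illustrate what the hand-rolled HTTP clients skip, here is a minimal sketch of reading the response head and detecting the cases mentioned above (redirects, gzip). It is an illustration only, not code from any of the answers: ResponseHead, read_head, is_redirect and is_gzipped are hypothetical names, and header names are matched exactly as received, although HTTP treats them as case-insensitive.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper type: the parts of an HTTP response head we look at.
struct ResponseHead {
    unsigned int status = 0;
    std::map<std::string, std::string> headers; // names stored as received
};

// Read the status line and the headers from s, stopping after the blank line.
ResponseHead read_head(std::istream& s)
{
    ResponseHead head;
    std::string http_version;
    s >> http_version >> head.status;
    std::string reason;
    std::getline(s, reason); // discard the rest of the status line
    if (!s || http_version.compare(0, 5, "HTTP/") != 0)
        throw std::runtime_error{"Invalid response"};

    for (std::string line; std::getline(s, line) && line != "\r" && !line.empty();) {
        if (line.back() == '\r')
            line.pop_back();
        const auto colon = line.find(':');
        if (colon == std::string::npos)
            continue; // skip malformed header lines
        const auto value_start = line.find_first_not_of(" \t", colon + 1);
        head.headers[line.substr(0, colon)] =
            value_start == std::string::npos ? std::string{} : line.substr(value_start);
    }
    return head;
}

// True for a 3xx response that names a new location; a complete client would follow it.
bool is_redirect(const ResponseHead& head)
{
    return head.status >= 300 && head.status < 400 && head.headers.count("Location") > 0;
}

// True when the body is gzip-compressed and would need decompressing first.
// Assumption: the header is spelled exactly "Content-Encoding".
bool is_gzipped(const ResponseHead& head)
{
    const auto it = head.headers.find("Content-Encoding");
    return it != head.headers.end() && it->second == "gzip";
}
```

Following the redirect, decompressing the body and speaking TLS are the parts that need real work. Those are the parts a full HTTP library provides.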
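Content-Type: text/html; charset=UTF-8
... I am going to email him.
boost/asio is used up there, and it is a third-party library. I mean, how will languages that don't include URL/TCP fetching as part of their standard library compete?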