The Missing Semester of Your CS Education

2022-08-10 23:44:00 【ek1ng】

I mainly remembered that I still wasn't comfortable with vim and that this course teaches vim well, so I took the time to go through the whole course with a focus on the vim material. I read the community's Chinese translation of the notes. I can already use most of these tools fairly fluently, so I skipped the English videos, which felt like a waste of time for me.

shell

First up is the shell. The course covers it in the first and second lectures, but since the content overlaps, I have written them up together.

Course overview and the shell

Course content

I do use the shell frequently, but not well. Learning some of the less common shell commands should improve my efficiency.

Before watching the course I kept thinking about how ugly the default Windows PowerShell is. I don't always want to use the shell in my Manjaro virtual machine, and I don't have WSL installed, so as a Windows user the built-in PowerShell really isn't pleasant to use. I installed PowerShell 7.1 + oh my posh with the JanDeDobbeleer theme, and it looks pretty good now.

After tinkering with PowerShell, I remembered that the course uses bash, which made me think of Git Bash, so I configured that as well. It can now be opened from cmd and has a decent-looking theme.

Happy with the new theme, I then started learning shell commands.

Here are brief notes on things I wasn't familiar with before.

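Paths in the shell are a delimited list of directories, separated by / on Linux and macOS and by \ on Windows.
The current working directory can be obtained with the pwd command.
ls -a also lists files whose names start with a dot (.); ls -l lists file information in more detail.
cat < hello.txt > hello2.txt: it turns out cat does more than print a file. It concatenates its input and writes it to standard output, so both its input and its output can be redirected. This command copies the contents of hello.txt into hello2.txt.
One thing we must know about the shell: |, > and < are handled by the shell, not by the individual programs. Programs such as echo do not know that | exists; they only read from their input stream and write to their output stream.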
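
To make the last point concrete, here is a minimal Python sketch of what a shell roughly does for cat < hello.txt > hello2.txt. The helper name copy_via_redirection and the tiny stand-in for cat are my own assumptions for illustration; the point is only that the parent opens the files and the child just reads its stdin and writes its stdout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A tiny stand-in for `cat`: copy standard input to standard output.
CAT = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

def copy_via_redirection(src: Path, dst: Path) -> None:
    """Rough equivalent of `cat < src > dst`."""
    # The parent opens both files, as the shell would, and hands them
    # to the child as its stdin and stdout. The child never sees a path.
    with open(src, "r") as fin, open(dst, "w") as fout:
        subprocess.run(CAT, stdin=fin, stdout=fout, check=True)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "hello.txt"
        dst = Path(tmp) / "hello2.txt"
        src.write_text("hello\n")
        copy_via_redirection(src, dst)
        print(dst.read_text(), end="")
```

The child program contains no file names at all, which is the same reason echo does not know about | or >.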

That is about all the new material. Next there is a small lab to complete.

Exercises

The earlier exercises are relatively simple.

The next one asks us to write the following into a file named semester, one line at a time:

#!/bin/sh
curl --head --silent https://missing.csail.mit.edu

While doing this I got a bit confused about echo versus cat, so I looked them up. cat outputs the contents of files, while echo outputs the strings it is given as arguments. Both write to standard output; what differs is where the input comes from. To output a string typed on the command line, use echo; to output the contents of a file, use cat. Either way, > redirects standard output, here into the semester file.

So the simplest approach is to use echo and redirect the string into the file.

Writing #!/bin/sh is slightly tricky: # starts a comment in Bash, and ! has a special meaning even inside double quotes ("). Single quotes (') behave differently, and that is what solves the problem here.

Second, writing twice with > makes the second write overwrite the first; use >> to append instead.
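
The difference between > and >> corresponds to opening a file in truncate mode versus append mode. A small sketch in Python, where write_lines is a hypothetical helper of my own and not something from the course:

```python
import tempfile
from pathlib import Path

def write_lines(path: Path, lines, append: bool) -> None:
    """Write lines to path, truncating (like >) or appending (like >>)."""
    mode = "a" if append else "w"  # "w" empties the file first, "a" keeps it
    with open(path, mode) as f:
        for line in lines:
            f.write(line + "\n")

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        semester = Path(tmp) / "semester"
        write_lines(semester, ["#!/bin/sh"], append=False)
        write_lines(
            semester,
            ["curl --head --silent https://missing.csail.mit.edu"],
            append=True,
        )
        print(semester.read_text(), end="")
```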

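The next requirement:

Try to execute the file, i.e. type the path to the script (./semester) into your shell and press Enter. If it does not run, use ls to get information and understand why.

At first it cannot be executed. The naive explanation is that I am not root, since root can do anything. More precisely, ls -l shows that the file's owner is ek1ng and its group is users, and that the permissions for owner, group and others are rw- r-- r--. The current user is ek1ng, who has only read and write permission and no execute permission (x); with rwx the file could be run. So we need chmod to add the execute permission before running the file.

After adding execute permission with chmod 777, the script runs. (chmod +x semester would be enough; 777 also makes the file writable by everyone.)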
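
As a sketch of what happens to the permission bits, the following Python snippet prints the mode string in the same notation as ls -l and then adds only the execute bits. It assumes a POSIX system; add_execute_permission is a hypothetical helper name of my own.

```python
import os
import stat
import tempfile

def add_execute_permission(path: str) -> str:
    """Rough equivalent of `chmod +x path`; returns the new mode string."""
    mode = os.stat(path).st_mode
    # Add the execute bit for owner, group and others, keep everything else.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return stat.filemode(os.stat(path).st_mode)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "semester")
        with open(script, "w") as f:
            f.write("#!/bin/sh\necho hello\n")
        os.chmod(script, 0o644)
        print(stat.filemode(os.stat(script).st_mode))  # mode before
        print(add_execute_permission(script))          # mode after
```

Unlike chmod 777, this leaves the write bits for group and others untouched.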

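Use | and > to write the last-modified date output by semester into a file called last-modified.txt in your home directory.

The pipe operator | is all that is needed.

Write a command that reads the laptop's battery level or the desktop's CPU temperature from /sys.

I couldn't find these under VMware for some reason; maybe I was doing it wrong.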

Shell tools and scripting

Course content

Variables

Interestingly, foo = bar (with spaces) does not work, because the interpreter runs the program foo with = and bar as arguments. In shell scripts a space splits arguments, which can be confusing at times, so double-check.

Strings in Bash are delimited by ' and ", but the two are not equivalent. A string delimited by ' is a literal string in which variables are not expanded, whereas a string delimited by " has its variables replaced by their values.

bash uses many special variables to refer to arguments, error codes and related values.

$0 - name of the script
$1 to $9 - arguments to the script
$@ - all the arguments
$# - number of arguments
$? - return code of the previous command
$$ - process ID of the current script
!! - the entire previous command, including arguments. A common use: when a command fails for lack of permissions, retry it with sudo !!
$_ - last argument of the previous command. In an interactive shell you can also get this value by pressing Esc followed by .

Commands usually return output through STDOUT and errors through STDERR, plus a return code that lets scripts report errors in a friendlier way. A return code of 0 means the command succeeded; any non-zero value means an error occurred.

Multiple commands on the same line can be separated with ;. The program true always has return code 0 and false always has return code 1.
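
Return codes are not specific to the shell; any parent process can read them. A minimal sketch in Python, where the helper exit_code and the two stand-ins for true and false are my own assumptions for illustration:

```python
import subprocess
import sys

def exit_code(snippet: str) -> int:
    """Run a tiny Python program in a child process and return its exit code."""
    result = subprocess.run([sys.executable, "-c", snippet])
    return result.returncode

if __name__ == "__main__":
    print(exit_code("pass"))                     # stand-in for `true`
    print(exit_code("import sys; sys.exit(1)"))  # stand-in for `false`
```

This is the same value the shell exposes as $? after a command finishes.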

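Command substitution

When you write $( CMD ), the shell runs CMD and substitutes its output for $( CMD ).

Process substitution

<( CMD ) runs CMD, places the output in a temporary file, and substitutes <( CMD ) with that file's name.

Running scripts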

#!/bin/bash

echo "Starting program at $(date)" # $(date) is replaced by the current date and time

echo "Running program $0 with $# arguments with pid $$"

for file in "$@"; do
    grep foobar "$file" > /dev/null 2> /dev/null
    # When the pattern is not found, grep has exit status 1
    # We redirect STDOUT and STDERR to /dev/null since we do not care about them
    if [[ $? -ne 0 ]]; then
        echo "File $file does not have any foobar, adding one"
        echo "# foobar" >> "$file"
    fi
done

Globbing

Wildcards

For wildcard matching, use ? to match exactly one character and * to match any number of characters.

Curly braces {}

When a series of commands share a common substring, curly braces expand them automatically. This is handy for moving or converting files in batches.


convert image.{png,jpg}
# will expand to
convert image.png image.jpg

cp /path/to/project/{foo,bar,baz}.sh /newpath
# will expand to
cp /path/to/project/foo.sh /path/to/project/bar.sh /path/to/project/baz.sh /newpath

# globbing techniques can also be combined
mv *{.py,.sh} folder
# will move all *.py and *.sh files

mkdir foo bar

# this creates the files foo/a, foo/b, ... foo/h, bar/a, bar/b, ... bar/h
touch {foo,bar}/{a..h}
touch foo/x bar/y
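
Brace expansion is essentially a Cartesian product over the alternatives. The sketch below mimics it in Python; expand and letters are hypothetical helpers of my own, and this is only an approximation of what bash does, not its actual implementation.

```python
from itertools import product

def expand(*parts):
    """Expand like bash braces: each part is a string or a list of alternatives."""
    choices = [[p] if isinstance(p, str) else list(p) for p in parts]
    return ["".join(combo) for combo in product(*choices)]

def letters(start: str, end: str):
    """Rough analogue of the {a..h} range form, for single characters."""
    return [chr(c) for c in range(ord(start), ord(end) + 1)]

if __name__ == "__main__":
    print(expand("image.", ["png", "jpg"]))
    print(expand(["foo", "bar"], "/", letters("a", "h")))
```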

shebang

Consider the following code:

#!/usr/local/bin/python
import sys
for arg in reversed(sys.argv[1:]):
    print(arg)

The kernel knows to run this script with the python interpreter instead of a shell because of the shebang on the first line of the script.

Shell tools

What matters is knowing that a problem can be solved with the right tool; which particular tool you pick is less important.

find

Finds files; fd is an alternative.

grep

Searches file contents.

Finding shell commands

history can be searched with Ctrl+R, or piped through grep to find the command you want.

Exercises

Read man ls and write an ls command that does the following:

Includes all files, including hidden files: -a
Prints sizes in human-readable format (e.g. 454M instead of 454279954): -h (together with -l)
Orders files by recency (most recently modified first): -t
Colorizes the output: --color=auto

Write two bash functions, marco and polo, that do the following. Whenever you run marco, the current working directory should be saved in some way; when you run polo, no matter what directory you are in, it should cd back to the directory where marco was run. For easier debugging, you can write the code in a separate file marco.sh and (re)load the definitions with source marco.sh, after which the functions can be used directly in bash.

#!/bin/bash
marco(){
    echo "$(pwd)" > $HOME/marco_history.log
    echo "save pwd $(pwd)"
}
polo(){
    cd "$(cat "$HOME/marco_history.log")"
}

Suppose you have a command that fails rarely. To debug it you need to capture its output, but it can take a long time to get a failing run. Write a bash script that runs the following script until it fails, captures its standard output and standard error streams to a file, and prints everything at the end. Bonus: report how many runs it took for the script to fail.

debug.sh

#!/usr/bin/env bash
count=1

while true
do
    # capture both stdout and stderr of each run
    ./buggy.sh > out.log 2>&1
    if [[ $? -ne 0 ]]; then
        echo "failed on run $count"
        cat out.log
        break
    fi
    ((count++))
done

buggy.sh

#!/usr/bin/env bash

n=$(( RANDOM % 100 ))

if [[ n -eq 42 ]]; then
    echo "Something went wrong"
    >&2 echo "The error was using magic numbers"
    exit 1
fi

echo "Everything went according to plan"

Your task is to write a command that recursively finds all HTML files in a folder and compresses them into a zip file. Note that your command should work even if file names contain spaces (hint: check the -d flag of xargs).

Tip: some commands, such as tar, take their inputs from arguments. Here we can use xargs, which turns its standard input into arguments.

First create some HTML files:

mkdir html_root
 cd html_root
 touch {1..10}.html
 mkdir html
 cd html
 touch xxxx.html

find html_root -name "*.html" | xargs -d '\n' tar -cvzf html.zip

Strictly speaking, tar -cvzf creates a gzip-compressed tar archive that is merely named html.zip.

(Advanced) Write a command or script that recursively finds the most recently modified file in a directory. More generally, can you list all files by recency?

find . -type f -print0 | xargs -0 ls -lt | head -1

When there are many files, the command above can give the wrong result, because xargs splits a long file list across several ls invocations and each one is sorted separately. The workaround is to add an -mmin condition so that find pre-filters recently modified files before handing them to ls for sorting:

find . -type f -mmin -60 -print0 | xargs -0 ls -lt | head -10
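
Another way to sidestep the splitting problem is to collect and sort everything in one process. A minimal Python sketch, assuming modification time is the notion of recency we care about; files_by_recency is a hypothetical helper of my own:

```python
import os
import tempfile

def files_by_recency(root: str):
    """Return all regular files under root, most recently modified first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                found.append((os.path.getmtime(path), path))
    found.sort(reverse=True)  # newest modification time first
    return [path for _mtime, path in found]

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        older = os.path.join(tmp, "older.html")
        newer = os.path.join(tmp, "newer.html")
        for p in (older, newer):
            open(p, "w").close()
        os.utime(older, (1_000_000, 1_000_000))  # set explicit timestamps
        os.utime(newer, (2_000_000, 2_000_000))
        for path in files_by_recency(tmp):
            print(os.path.basename(path))
```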

vim

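Honestly, learning vim is close to a necessity for me. I use Linux more and more, yet all I can do so far is enter insert mode, delete lines with dd, and quit with :wq or :q. I know nothing else, so it is time to learn vim properly.

Design of Vim

Vim avoids the mouse because it is too slow; Vim even avoids the arrow keys because they require too much finger movement.

Modes

Vim's design is based on the idea that most of your time is spent reading, navigating and making small edits, so it has several operating modes:

Normal: for moving around a file and making edits
Insert: for inserting text
Replace: for replacing text
Visual (plain, line or block): for selecting blocks of text
Command-line: for running a command

Keystrokes have different meanings in different modes. By default Vim shows the current mode in the bottom left. Vim starts in Normal mode, and you will generally spend most of your time between Normal and Insert mode.

Press <ESC> (the escape key) to return to Normal mode from any other mode. From Normal mode, enter Insert mode with i, Replace mode with R, Visual mode with v, Visual Line mode with V, Visual Block mode with Ctrl-V, and Command-line mode with :.

Vim's interface is a programming language

The most important idea in Vim is that its interface is itself a programming language. Keystrokes (with mnemonic names) are commands, and these commands compose. This makes movement and editing efficient, especially once the commands become muscle memory.

How to use it

Inserting text

Press i to enter Insert mode, then edit the text.

Buffers, tabs and windows

Vim maintains a set of open files called "buffers". A Vim session has a number of tabs, each of which has a number of windows (split panes). Each window shows a single buffer. Unlike other programs you may know, such as web browsers, there is no one-to-one correspondence between buffers and windows; windows are merely views. A given buffer may be open in multiple windows, even within the same tab, which is handy for viewing two different parts of a file at once. vim -o file1 file2 opens the files in multiple windows, :split file2 opens a new window, and :vsplit file2 opens a new vertically split window.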

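Command-line mode

:q - quit
:w - save
:wq - save and quit
:e filename - open a file for editing
:ls - show open buffers
:help name - open the help for name

Movement

Most of the time you will be in Normal mode, using movement commands to navigate the buffer. Movements in Vim are also called "nouns", because they refer to chunks of text.

Basic movement: hjkl (left, down, up, right) (up and down feel like enough to me)
Words: w (next word), b (beginning of word), e (end of word)
Lines: 0 (beginning of line), ^ (first non-blank character), $ (end of line)
Screen: H (top of screen), M (middle of screen), L (bottom of screen)
Scroll: Ctrl-u (up), Ctrl-d (down)
File: gg (beginning of file), G (end of file)
Line numbers: :{number}<CR> or {number}G (go to line {number})
Misc: % (jump to the matching item, e.g. a bracket or a /* */ comment pair)
Find: f{character}, t{character}, F{character}, T{character}
- find/to forward/backward {character} on the current line
- , / ; for navigating matches
Search: /{regex}, n / N for navigating matches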

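Selection

Visual modes:

Visual: v
Visual Line: V
Visual Block: Ctrl+v

Movement keys such as hjkl extend the selection, so you can select a large section and delete it in one go. Deleting line by line with dd in Normal mode, as I used to, is very inefficient.

Edits

Everything you used to do with the mouse you now do with the keyboard, by combining editing commands with movement commands. This is where Vim's interface starts to look like a programming language. Vim's editing commands are also called "verbs", because verbs act on nouns.

i - enter Insert mode
- but for manipulating/deleting text, you want something more than backspace
O / o - insert a line above / below
d{motion} - delete {motion}
- e.g. dw deletes a word, d$ deletes to end of line, d0 deletes to beginning of line
c{motion} - change {motion}
- e.g. cw changes a word
- like d{motion} followed by i
x - delete character (equal to dl)
s - substitute character (equal to cl)
Visual mode + manipulation
- select text, then d to delete it or c to change it
u - undo, <C-r> - redo
y - copy / "yank" (some other commands like d also copy)
p - paste
Lots more to learn, e.g. ~ flips the case of a character

Counts

You can combine nouns and verbs with a count, which performs the action that many times.

3w - move 3 words forward
5j - move 5 lines down
7dw - delete 7 words

I really couldn't get through the rest. After reading the above I can already get by, and the remainder is best picked up while using vim: keep searching for better solutions to the problem at hand and improve gradually.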

Exercises

Complete vimtutor (the tutorial that ships with vim; run vimtutor on the command line)
Learn by doing rather than by memorizing

vimtutor is the tutorial that comes with vim; it teaches vim through practice.

This part is useful: a command is an operator plus a motion, and motions can also be used on their own. For example, w moves the cursor to the start of the next word.

Finally done. I skipped the later parts because they didn't seem very useful. On Linux I am still more used to editing code with mouse-driven editors such as VS Code, so I haven't tried code-editing operations such as jumping between matching braces.

Download our vimrc and save it to ~/.vimrc. Read through this well-commented file (using Vim!), then observe how Vim looks and behaves slightly differently with the new config.

First, line numbers now appear on the left; then there is syntax highlighting; and the four arrow keys are disabled, with an error message shown when they are pressed.

I didn't look closely, but that is roughly it. The English comments are a bit hard to read.

Install and configure a plugin: ctrlp.vim.
Customize CtrlP: add configuration to your ~/.vimrc to open CtrlP by pressing Ctrl-P

Torture. I'm not going to switch to editing code in vim right now; too annoying, so I'm skipping this.

Further customize your ~/.vimrc and install more plugins. The easiest way to install plugins is with a plugin manager for Vim, such as vim-plug.

I don't know why the plugin wouldn't install. Ugh.

All in all I gained a lot. At least I can now use vim effectively for most tasks. For me it is certainly not yet as efficient as a mouse-driven text editor, but it works, and the more I use it the more fluent and efficient I will become.

Data Wrangling

This section is about data wrangling: converting data stored in one format into another.

Use cases for data wrangling

Log processing is a typical use case: we often need to find specific information in a log, and reading the whole log is impractical.

Let's study the system log: connect to my own server (39.108.253.105) over ssh and see which users have tried to log in:

ssh -l root 39.108.253.105 journalctl

This returns far too much information, so we need to filter it to get at the useful part:

ssh -l root 39.108.253.105 journalctl | grep sshd

sshd is the name of the ssh service process. There is still a lot of output, so let's improve this:

ssh -l root 39.108.253.105 'journalctl | grep sshd | grep "Disconnected from"' | less

What are the extra quotation marks for? Our log is a very large file, and streaming all of it to the local machine just to filter it there wastes bandwidth. Instead, we filter on the remote machine and transfer only the result. less gives us a pager so that we can scroll through the long output page by page.

To save even more traffic, we can write the filtered log to a file so that we don't need to go over the network again:

ssh -l root 39.108.253.105 'journalctl | grep sshd | grep "Disconnected from"' > ssh.log
less ssh.log

This failed with permission denied; I don't know why...

Let's look at sed, a very powerful tool. sed is a "stream editor" built on top of the old ed editor. In sed you give short commands that describe how to modify the file rather than manipulating its contents directly (although you can do that too). There are many commands, but the most common is s, substitution. For example, we can write:

ssh -l root 39.108.253.105 journalctl | grep sshd | grep "Disconnected from"| sed 's/.*Disconnected from //'

Regular expressions

/.*Disconnected from /: regular expressions are usually (though not always) surrounded by /.

. - any single character except newline
* - zero or more of the preceding match
+ - one or more of the preceding match
[abc] - any one character of a, b and c
(RX1|RX2) - either something that matches RX1 or RX2
^ - start of line
$ - end of line
{num1,num2} - between num1 and num2 repetitions of the preceding match

Looking back at /.*Disconnected from /, we see that it matches any text that starts with any number of characters followed by the literal string "Disconnected from ", which is exactly what we want. sed can do other handy things too, such as printing matched lines or performing multiple substitutions in one invocation.

Matching the text after the username is trickier, especially because the username may contain spaces. What we need to do is match the whole line:

| sed -E 's/.*Disconnected from (invalid |authenticating )?user .* [^ ]+ port [0-9]+( \[preauth\])?$//'

The start is the same as before. Then we match either of the two "user" variants (there are two prefixes in the logs). Next we match any string of characters where the username is, followed by any single word ([^ ]+ matches any non-empty sequence of non-space characters). Then comes the word "port" followed by a sequence of digits, then possibly the suffix [preauth], and finally the end of the line.

There is still one problem: the entire log line is replaced by the empty string, so the whole log is deleted. What we actually want is to keep the username. For this we can use "capture groups". Any text matched by a regex surrounded by parentheses is stored in a numbered capture group. These are available in the substitution (and in some engines, even in the pattern itself) as \1, \2, \3 and so on, so we can use:

| sed -E 's/.*Disconnected from (invalid |authenticating )?user (.*) [^ ]+ port [0-9]+( \[preauth\])?$/\2/'
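
To convince myself the capture group really keeps the username, here is the same pattern tried in Python's re module. The two log lines are made up for illustration (they are not from my server), and extract_username is a hypothetical helper name.

```python
import re

# The same pattern as the sed command above, written for Python's re module.
PATTERN = re.compile(
    r".*Disconnected from (invalid |authenticating )?user (.*) "
    r"[^ ]+ port [0-9]+( \[preauth\])?$"
)

def extract_username(line: str) -> str:
    """Replace the whole matching line with capture group 2 (the username)."""
    return PATTERN.sub(r"\2", line)

if __name__ == "__main__":
    # Made-up log lines, for illustration only.
    samples = [
        "Jan 17 03:13:00 myhost sshd[2631]: Disconnected from invalid user "
        "Disco Stu 46.97.239.16 port 55920 [preauth]",
        "Jan 17 03:14:10 myhost sshd[2640]: Disconnected from user root "
        "10.0.0.7 port 40022",
    ]
    for line in samples:
        print(extract_username(line))
```

Lines that do not match the pattern are returned unchanged, which mirrors what sed does with its s command.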

This one hurts to look at. Regular expressions are so complicated.

Back to data wrangling

sed can do all sorts of other interesting things, such as inserting text (the i command), explicitly printing lines (the p command), and selecting lines by index. See man sed for details.

So hard... I really had no energy left for the rest. It is hard to understand just by reading, and I would probably forget it two days after getting the gist.

Exercises

Take this short interactive regex tutorial.

The interactive tutorial is pretty good and left me with some feel for how regular expressions are used:

\d matches a digit, \s matches whitespace, \w matches a word character (letter, digit or underscore), ^ marks the start of a line, $ marks the end of a line, and () defines a group.

The content is actually pretty good, but I have hit my limit for today, so I'll leave it here and skip the rest. It was getting a bit tedious.

Command-line Environment

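This lecture covers how to run several processes at once while keeping track of them, how to stop or pause a specific process, and how to make a process run in the background. It also covers ways to improve your shell and tool workflow, mainly by defining aliases and configuring tools through dotfiles.

In short, it is about managing processes from the command line and configuring the command-line environment.

Job control

As everyone knows, <C-c> stops the command that is currently running.

Killing a process

The shell uses a UNIX communication mechanism called a signal to communicate information to a process. When a process receives a signal, it stops its execution, deals with the signal, and potentially changes the flow of execution based on the information the signal delivered. For this reason, signals are software interrupts.

When we type Ctrl-C, the shell delivers a SIGINT signal to the process.

Below is a program that captures SIGINT and ignores it. To stop this program you need SIGQUIT, which you can send by typing Ctrl-\.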

#!/usr/bin/env python
import signal, time

def handler(signum, frame):
    print("\nI got a SIGINT, but I am not stopping")

signal.signal(signal.SIGINT, handler)
i = 0
while True:
    time.sleep(.1)
    print("\r{}".format(i), end="")
    i += 1

Strangely, I couldn't send SIGQUIT with Ctrl+\. I tried again in Git Bash and still couldn't stop the program. Is something wrong with my keyboard?

Yet I can type ^\ normally.

I didn't understand where the problem was. At first I thought it couldn't be an operating-system difference, since Git Bash is supposed to emulate a Linux command line.

In the Manjaro virtual machine the program stops normally, so I think the root cause is that the Linux environment emulated by Git Bash is not faithful enough. A shell is essentially an application that talks to the operating system kernel, and Windows presumably does not turn ^\ into SIGQUIT; only ^C sending SIGINT is supported. That is why PowerShell and Git Bash failed to stop the process, while zsh in Manjaro succeeded.

The above covers SIGINT and SIGQUIT. SIGTERM is a more generic signal for asking a process to exit gracefully. To send it we use the kill command, with the syntax kill -TERM <PID>.
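
The same thing can be done from a program instead of the kill command. A minimal sketch in Python, assuming a POSIX system (on Windows signals behave differently, as I found out above); terminate_gracefully is a hypothetical helper name.

```python
import signal
import subprocess
import sys

def terminate_gracefully(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Send SIGTERM (like `kill -TERM <PID>`) and wait for the process to exit."""
    proc.send_signal(signal.SIGTERM)
    return proc.wait(timeout=timeout)

if __name__ == "__main__":
    # A child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    code = terminate_gracefully(child)
    # On POSIX a negative return code means "killed by that signal number".
    print(code)
```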

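Pausing and backgrounding processes

Signals can do other things besides killing a process. For instance, SIGSTOP pauses a process; typing Ctrl-Z in the terminal makes the shell send SIGTSTP, the terminal's version of SIGSTOP. A paused job can be continued in the foreground with fg or in the background with bg. The jobs command lists the unfinished jobs associated with the current terminal session.

Backgrounded processes are still children of your terminal and will die if you close the terminal (which sends yet another signal, SIGHUP). To prevent that, you can run the program with nohup (a wrapper that ignores SIGHUP).

For example, I recently deployed a QQ bot on my club's server. To keep the bot running after the ssh connection drops, I can either keep a terminal alive with screen, or use nohup so that closing the terminal does not affect the bot's background process. A job can be referred to with a percent sign followed by its job number (printed by jobs).

The & suffix runs a command directly in the background, so you can keep doing other things in the shell.

The following command-line session demonstrates some of the above. For example, job 2 is a background child of the current terminal started with nohup; because of nohup, SIGHUP cannot kill it, although sending it a kill signal directly still works.

Terminal multiplexers

When using the command line you will often want to run more than one thing at once, for instance your editor and your program side by side. Although opening another terminal window works, a terminal multiplexer is a better solution.

I don't feel I need this right now. It has a learning cost, and what I learn is easily forgotten without practice, so it is enough to know that a tool like tmux exists.

Aliases

To avoid retyping a long command with many flags, the shell supports aliases: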

alias alias_name="command_to_alias arg1 arg2"

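# Make shorthands for common flags
alias ll="ls -lh"

# Save a lot of typing for common commands
alias gs="git status"
alias gc="git commit"
alias v="vim"

# Save you from mistyping
alias sl=ls

# Overwrite existing commands for better defaults
alias mv="mv -i"           # -i prompts before overwrite
alias mkdir="mkdir -p"     # -p make parent dirs as needed
alias df="df -h"           # -h prints human readable format

# Aliases can be composed
alias la="ls -A"
alias lla="la -l"

# To ignore an alias, run it prepended with \
\ls
# Or disable an alias altogether with unalias
unalias la

# To get an alias definition, just call it with alias
alias ll
# Will print ll='ls -lh'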

By default the shell does not persist aliases. To make an alias permanent, put it in a shell startup file such as .bashrc or .zshrc.

Dotfiles

Many programs are configured with plain-text files known as dotfiles (because their names begin with a ., e.g. ~/.vimrc, which also means they are hidden by default and ls does not show them).

In practice, many programs ask you to add a line such as export PATH="$PATH:/path/to/program/bin" to your shell configuration file so that the shell can find their binaries.

Some other tools that can be configured through dotfiles:

bash - ~/.bashrc, ~/.bash_profile
git - ~/.gitconfig
vim - ~/.vimrc and the ~/.vim directory
ssh - ~/.ssh/config
tmux - ~/.tmux.conf

Remote machines (ssh)

Speaking of ssh, I have to recommend Termius. I started using this ssh client on the recommendation of a senior club member, and it is really good.

You can connect to another server with ssh like this:

ssh foobar@server

An often overlooked feature of ssh is that it can run commands remotely. ssh foobar@server ls runs ls in the home directory of foobar. It also works with pipes: ssh foobar@server ls | grep PATTERN greps locally over the remote output of ls, while ls | ssh foobar@server grep PATTERN greps remotely over the local output of ls.

Remote command execution over ssh was also put to use in the data wrangling section.

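SSH keys

Key-based authentication relies on public-key cryptography: we prove to the server that the client holds the private key without revealing the key itself. This way you do not need to enter your password every time you log in. However, the private key (often ~/.ssh/id_rsa or ~/.ssh/id_ed25519) is effectively your password, so keep it safe.

Key generation

Generate a key pair with ssh-keygen:

ssh-keygen -o -a 100 -t ed25519 -f ~/.ssh/id_ed25519

Key-based authentication

ssh looks into .ssh/authorized_keys to determine which clients are allowed to log in.

Copying files over SSH

ssh+tee: the simplest approach is to run ssh and feed it through standard input, i.e. cat localfile | ssh remote_server tee serverfile. Recall that tee writes its standard input into a file;
scp: when copying large numbers of files or directories, scp is more convenient since it can easily recurse over paths. The syntax is scp path/to/local_file remote_host:path/to/remote_file;
rsync improves on scp by detecting identical files on the local and remote side and not copying them again. It also offers fine-grained control over symlinks, permissions and so on, and can resume an interrupted copy with the --partial flag. Its syntax is similar to scp.

Port forwarding with ssh

Local port forwarding

Local port forwarding: a service on the remote machine listens on a port, and you want a port on your local machine to connect and forward to that remote port.

For example, if we run a Jupyter notebook on the remote server listening on port 8888, we can forward local port 9999 to it with ssh -L 9999:localhost:8888 foobar@remote_server and then simply open localhost:9999 locally.

Remote port forwarding

Doesn't seem very useful to me.

Exercises

We have seen that ps aux | grep can be used to get a job's pid and then kill it, but there are better ways. Start a sleep 10000 job in a terminal, background it with Ctrl-Z and continue its execution with bg. Now use pgrep to find its pid and pkill to kill it without ever typing the pid yourself. (Hint: use the -af flags.)

pgrep is basically a more convenient way to filter out the pid of the process you want.

Say you don't want to start a process until another completes. How would you go about it? In this exercise our limiting process is always sleep 60 &. One way to achieve this is the wait command. Try launching the sleep command and having an ls wait until the background process finishes.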

sleep 60 &
# wait takes pids as arguments; it does not read them from a pipe
wait $!; ls

However, this strategy fails if we start in a different bash session, since wait only works for child processes. One feature we did not discuss is that kill's exit status is zero on success and non-zero otherwise. kill -0 does not send a signal but gives a non-zero exit status if the process does not exist. Write a bash function called pidwait that takes a pid and waits until that process completes. Use sleep to avoid wasting CPU unnecessarily.

pidwait()
{
   # loop until the process no longer exists
   while kill -0 "$1" 2>/dev/null
   do
      sleep 1
   done
   ls
}
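
For comparison, the same polling idea can be sketched in Python, where os.kill(pid, 0) plays the role of kill -0. This assumes a POSIX system; pid_exists and pidwait are hypothetical helper names. One caveat: a finished child of the calling process that has not been reaped yet still counts as existing.

```python
import os
import time

def pid_exists(pid: int) -> bool:
    """Like `kill -0 pid`: check for the process without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # the process exists but belongs to another user
    return True

def pidwait(pid: int, interval: float = 1.0) -> None:
    """Block until the process with the given pid no longer exists."""
    while pid_exists(pid):
        time.sleep(interval)  # sleep between polls to avoid burning CPU

if __name__ == "__main__":
    print(pid_exists(os.getpid()))  # our own process certainly exists
```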

I'm skipping the terminal multiplexer exercises for now, and the ssh ones too since I already use ssh a lot day to day. I was a bit lazy about the dotfiles exercises as well, so I'll stop here. Along the way I found two problems in the course's Chinese notes and submitted two PRs to the repository, haha.

Git

Git's data model

Git has a well-thought-out model that enables all the nice features of version control, such as maintaining history, supporting branches, and enabling collaboration.

Snapshots

In Git terminology, a file is called a "blob", which is just a bunch of bytes. A directory is called a "tree"; it maps names to blobs or trees (so directories can contain other directories). A snapshot is the top-level tree that is being tracked. For example, we might have a tree as follows:

<root> (tree)
|
+- foo (tree)
|  |
|  + bar.txt (blob, contents = "hello world")
|
+- baz.txt (blob, contents = "git is wonderful")

The top-level tree contains two elements: a tree "foo" (which itself contains one blob, "bar.txt") and a blob "baz.txt".

Modeling history: relating snapshots

How should a version control system relate snapshots? The simplest model is a linear history: a list of snapshots in time order. For many reasons, Git does not use such a simple model.

In Git, a history is a directed acyclic graph (DAG) of snapshots. Each snapshot refers to a set of "parents" rather than a single parent, because a snapshot may descend from multiple parents, for example when two branches are merged.

Git calls these snapshots "commits". Visualized, a commit history might look like this:

o <-- o <-- o <-- o
            ^  
             \
              --- o <-- o

In the ASCII art above, each o corresponds to an individual commit (snapshot).

The arrows point to the parent of each commit (it is a "comes before" relation, not "comes after").

Data model, as pseudocode

// a file is a bunch of bytes
type blob = array<byte>

// a directory contains named files and directories
type tree = map<string, tree | blob>

// a commit has parents, metadata, and the top-level tree
type commit = struct {
    parent: array<commit>
    author: string
    message: string
    snapshot: tree
}

Objects and content addressing

An object in Git can be a blob, a tree or a commit:

type object = blob | tree | commit

In Git's data store, all objects are content-addressed by their SHA-1 hash, so the hash value is critical.

objects = map<string, object>

def store(object):
    id = sha1(object)
    objects[id] = object

def load(id):
    return objects[id]

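Blobs, trees and commits are unified in this way: they are all objects. When they reference other objects, they do not contain them in their on-disk representation, but refer to them by their hash.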
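
The pseudocode for store and load can be turned into a small runnable sketch. Here I assume Git's convention of hashing a header of the form "<type> <size>" plus a NUL byte in front of the content; real Git additionally compresses objects and writes them under .git/objects, which this in-memory toy does not attempt.

```python
import hashlib

objects = {}  # maps hex object id -> raw object bytes (header + content)

def store(kind: str, content: bytes) -> str:
    """Store an object under the SHA-1 of its header plus content."""
    # The hashed data is "<type> <size>\0" followed by the content.
    data = f"{kind} {len(content)}".encode() + b"\0" + content
    object_id = hashlib.sha1(data).hexdigest()
    objects[object_id] = data
    return object_id

def load(object_id: str) -> bytes:
    """Return the content of a stored object, without its header."""
    data = objects[object_id]
    return data.split(b"\0", 1)[1]

if __name__ == "__main__":
    blob_id = store("blob", b"hello world\n")
    print(blob_id, load(blob_id))
```

Storing the same content twice yields the same id, which is why identical files are only stored once.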

<root> (tree)
|
+- foo (tree)
|  |
|  + bar.txt (blob, contents = "hello world")
|
+- baz.txt (blob, contents = "git is wonderful")

The tree in the example above looks like this (as printed by git cat-file -p):

100644 blob 4448adbf7ecd394f42ae135bbeed9676e894af85    baz.txt
040000 tree c68d233a33c5c06e0340e4c224f0afca87c8ce87    foo

The tree itself contains pointers to its contents, baz.txt (a blob) and foo (a tree). If we look up baz.txt by its hash with git cat-file -p 4448adbf7ecd394f42ae135bbeed9676e894af85, we get the following:

git is wonderful

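References

At this point every commit can be identified by its SHA-1 hash, but hashes are hard to remember, so Git also has references. A reference is a pointer to a commit. Unlike objects, references are mutable (they can be updated to point to a new commit). For example, the master reference usually points to the latest commit on the main branch. With this, Git can use human-readable names like "master" to refer to a particular snapshot in the history, instead of a long hexadecimal string.

Often we also want a notion of "where we currently are" in the history, so that when we create a new snapshot we know what it is relative to (how to set its parents). In Git, that "where we currently are" is a special reference called "HEAD".

Repositories

Now we can define a Git repository: it is the data objects and references.

On disk, all Git stores are objects and references: that is all there is to its data model. All git commands map to some manipulation of the commit DAG by adding objects and adding or updating references.

Staging area

The staging area is orthogonal to the data model, but it is part of the interface for creating commits.

Let's first understand the concepts of Git's working directory, staging area and repository:

Working directory: the directory you can see on your computer.
Staging area: called the stage or index. It is stored in the index file under .git (.git/index), which is why the staging area is sometimes called the index.
Repository: the working directory has a hidden .git directory, which is not part of the working directory but is Git's repository.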

The picture below is very clear

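Common Git commands

Basics

git help <command>: get help for a git command
git init: creates a new git repo, with data stored in the .git directory
git status: tells you what is going on
git add <filename>: adds files to the staging area
git commit: creates a new commit
- Write good commit messages!
- Even more reasons to write good commit messages!
git log: shows a flattened log of history
git log --all --graph --decorate: visualizes history as a DAG
git diff <filename>: show changes you made relative to the staging area
git diff <revision> <filename>: shows differences in a file between snapshots
git checkout <revision>: updates HEAD and the current branch

Branching and merging

git branch: shows branches
git branch <name>: creates a branch
git checkout -b <name>: creates a branch and switches to it
- same as git branch <name>; git checkout <name>
git merge <revision>: merges into the current branch
git mergetool: use a tool to help resolve merge conflicts
git rebase: rebase a set of patches onto a new base

Remotes

git remote: list remotes
git remote add <name> <url>: add a remote
git push <remote> <local branch>:<remote branch>: send objects to the remote and update the remote reference
git branch --set-upstream-to=<remote>/<remote branch>: set up correspondence between a local and a remote branch
git fetch: retrieve objects/references from a remote
git pull: same as git fetch; git merge
git clone: download a repository from a remote

Undo

git commit --amend: edit a commit's contents or message
git reset HEAD <file>: unstage a file
git checkout -- <file>: discard changes

Advanced Git

git config: Git is highly customizable
git clone --depth=1: shallow clone, without the entire version history
git add -p: interactive staging
git rebase -i: interactive rebasing
git blame: show who last edited which line
git stash: temporarily remove modifications to the working directory
git bisect: binary search history (e.g. for regressions)
.gitignore: specify intentionally untracked files to ignore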

Exercises

If you don't have any past experience with Git, either read the first couple of chapters of Pro Git or go through a tutorial like Learn Git Branching. As you work through it, relate Git commands to the data model.

I decided to go through Learn Git Branching. I had seen this game before and thought it looked good, but never played it. Here are some basic git commands I hadn't fully figured out.

How do you switch the commit you are currently on?

git branch creates a new branch. git checkout changes what HEAD points to: git checkout <branchname> switches branches, and git checkout <hash> detaches HEAD. Normally HEAD points to a branch such as master, which the tutorial displays as master*. For example, if master currently points to C2, we have HEAD -> master -> C2, and after checking out C2 by hash this becomes HEAD -> C2. HEAD is the commit we are currently on, so this is how we move HEAD to the commit we want to work from.

How do you undo changes?

git reset rolls back commits in the local repository.

git revert undoes commits that have already been pushed to a remote. It creates a new commit instead of erasing the reverted one from history.

How do you merge branches?

git merge <branchname> merges the given branch into the branch HEAD points to, creating a new merge commit.

git rebase <branchname> takes the commits that exist only on the branch HEAD points to (i.e. the commits by which the two branches differ) and replays them on top of <branchname>, turning the history into a linear sequence.

The above solves 90% of my problems; anything else I can look up when needed. I can't remember too many commands.

I looked at the other exercises, such as inspecting the commit history of the course website's git repository. I'm fairly comfortable with most of them; to find out who last changed a line, git blame does the job. The rest I rarely use, so I'll remember the above first. I'm not very proficient at rolling back and merging branches; mostly I only know git add ., git commit -m "" and git push XD

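Debugging and profiling

Debugging

Printf debugging and logging

Either add print statements, or use logging.

Advantages of logging:

You can log to files, sockets or even remote servers instead of standard output;
Logging supports severity levels (such as INFO, DEBUG, WARN, ERROR), which let you filter the output accordingly;
For new issues, there is a fair chance that your logs already contain enough information to detect what is going wrong.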
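
The points above can be sketched with Python's standard logging module. The logger name and the make_logger helper are my own choices for illustration; the stream could just as well be a file.

```python
import io
import logging

def make_logger(stream, level=logging.INFO) -> logging.Logger:
    """Build a logger that writes 'LEVEL: message' lines to the given stream."""
    logger = logging.getLogger("missing_semester_demo")  # hypothetical name
    logger.handlers.clear()   # start clean if the function is called again
    logger.propagate = False  # do not also print through the root logger
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger

if __name__ == "__main__":
    buffer = io.StringIO()
    log = make_logger(buffer, level=logging.WARNING)
    log.debug("value of x is 42")   # below WARNING, filtered out
    log.info("starting up")         # below WARNING, filtered out
    log.warning("disk almost full")
    log.error("could not open file")
    print(buffer.getvalue(), end="")
```

Changing the level argument is all it takes to see more or less detail, without touching the individual log calls.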

Colored logs are more readable. Here is an example bash script that prints colors in the terminal:

#!/usr/bin/env bash
for R in $(seq 0 20 255); do
    for G in $(seq 0 20 255); do
        for B in $(seq 0 20 255); do
            printf "\e[38;2;${R};${G};${B}m█\e[0m";
        done
    done
done

Third-party logs

If you are building a large software system you will most probably rely on dependencies that run as separate programs. Web servers, databases and message brokers are common examples.

When interacting with these systems it is often necessary to read their logs, since client-side error messages might not suffice. Most programs write their logs somewhere on your system; on UNIX systems it is commonplace for programs to write their logs under /var/log. For instance, the NGINX web server places its logs under /var/log/nginx.

More recently, systems have started using a system log, which is increasingly where all of your log messages go. Most (but not all) Linux systems use systemd, a system daemon that controls many things on your system, such as which services are enabled and running. systemd places its logs under /var/log/journal in a specialized format, and you can use the journalctl command to display the messages. On most UNIX systems you can also use the dmesg command to access the kernel log.

In addition, most programming languages have bindings for writing to the system log.

Debuggers

When printf debugging is no longer enough, you should use a debugger.

A debugger is a program that lets you interact with a program as it executes. It can:

Halt execution of the program when it reaches a certain line;
Step through the program one instruction at a time;
Inspect the values of variables after the program crashed;
Conditionally halt execution when a given condition is met;

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(n):
            if arr[j] > arr[j+1]:
                arr[j] = arr[j+1]
                arr[j+1] = arr[j]
    return arr

print(bubble_sort([4, 2, 1, 8, 7, 6]))

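The Python debugger is pdb. Here is a brief description of the commands it supports:

l(ist) - Displays 11 lines around the current line or continues the previous listing;
s(tep) - Executes the current line and stops at the first possible occasion; it can step into functions;
n(ext) - Continues execution until the next line in the current function is reached or the function returns;
b(reak) - Sets a breakpoint (depending on the argument provided);
p(rint) - Evaluates the expression in the current context and prints its value. There is also pp, which prints using pprint;
r(eturn) - Continues execution until the current function returns;
c(ontinue) - Continues execution until the next breakpoint or the end of the program;
q(uit) - Quits the debugger.

Note that since Python is an interpreted language, we can use the pdb shell to execute commands. ipdb is an improved pdb that uses IPython as its REPL, enabling tab completion, syntax highlighting, better tracebacks, and better introspection, while keeping the same interface as the pdb module.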

Next, let's try using pdb to debug this bubble sort code.

First, start the program under ipdb.

In the pdb shell, call step (type s) repeatedly to walk through the program one line at a time.

Once the index-out-of-range error shows up, print the value of j to take a look.

The value of j is 5, so j+1 is 6. Since j ranges over range(n), that is 0 to 5, arr[j+1] goes out of bounds when j is 5 and raises the error. The range of j should therefore be changed to range(n-1). Quit pdb, change the range of j, and run the program again.

Now the program no longer raises an error, but it clearly still does not sort. Let's enter pdb again and debug with a breakpoint.

Something is wrong in the sorting itself: the input is 4,2,1,8,7,6, but the output consists only of 1s and 6s, so the problem must be in the swap. Set a breakpoint on line 6, then continue execution until the breakpoint is hit.

Look at the values of the current variables.

After one more loop iteration, execution stops at the breakpoint again. Look at the contents of the array once more.

The problem is now found: the swap overwrites arr[j] before copying it into arr[j+1], so the original value is lost. Walking through these steps is an easy way to get familiar with pdb.
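For reference, here is a sketch of the function with both issues found during the debugging session fixed: the inner loop stops at n - 1, and the swap uses tuple assignment. This is one possible fix, not necessarily the one shown in the course.

```python
def bubble_sort(arr):
    """Sort arr in place in ascending order and return it."""
    n = len(arr)
    for i in range(n):
        # Stop at n - 1 so that arr[j + 1] never goes out of bounds.
        for j in range(n - 1):
            if arr[j] > arr[j + 1]:
                # Tuple assignment swaps both values without losing either one.
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr


print(bubble_sort([4, 2, 1, 8, 7, 6]))  # prints [1, 2, 4, 6, 7, 8]
```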

Specialized Tools

Even if the program you need to debug is a black-box binary, there are still tools that can help. Whenever a program needs to do something that only the operating system kernel can do, it uses system calls. There are commands that let you trace the system calls your program makes: on Linux there is strace, and on macOS there is dtruss. For example, strace or dtruss can show the stat system calls made during an execution of ls.

Static Analysis

For some problems you do not need to run any code. For example, just by reading a piece of code carefully, you might notice that a loop variable shadows an existing variable or function name, or that a variable is read before it is defined. This is where static analysis tools come in. Static analysis programs take source code as input and analyze it using coding rules to reason about its correctness.
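To make the idea concrete, here is a toy check written with Python's standard ast module. It is only a sketch of the principle (inspect the source without running it) and not how real tools such as pyflakes or mypy work; the helper name find_shadowed_functions and the sample source are made up for illustration.

```python
import ast

# Sample program to analyze; it is parsed, never executed.
SOURCE = (
    "def foo():\n"
    "    return 42\n"
    "\n"
    "for foo in range(5):\n"
    "    print(foo)\n"
)


def find_shadowed_functions(source):
    """Return (name, line) pairs where a for-loop variable reuses a function name."""
    tree = ast.parse(source)
    function_names = {
        node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
    }
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.For):
            # The loop target may be a single name or a tuple of names.
            for target in ast.walk(node.target):
                if isinstance(target, ast.Name) and target.id in function_names:
                    findings.append((target.id, target.lineno))
    return findings


print(find_shadowed_functions(SOURCE))  # prints [('foo', 4)]
```

The check only looks at the syntax tree, so it reports the shadowing on line 4 even though the sample program is never run.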

For style checking and code formatting there are complementary tools as well: black for Python, gofmt for Go, rustfmt for Rust, and prettier for JavaScript, HTML and CSS. These tools autoformat your code so that it is consistent with common style conventions. Although you might be reluctant to hand over stylistic control of your code, a standard code style helps other people read your code and makes it easier for you to read theirs.

Profiling

Since premature optimization is the root of all evil, you should learn about profilers and monitoring tools. They help you find which parts of your program take the most time or resources, so that you can focus your optimization on those parts.

Timing

As with debugging, in many cases it is enough to print the time elapsed between two points in your code to find the problem. However, because the CPU runs many processes at the same time, this wall-clock time is not necessarily an accurate measure of how long your code actually ran.
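The difference between elapsed time and time actually spent on the CPU can be sketched in Python with time.perf_counter (wall clock) and time.process_time (CPU time of the current process). The helper measure and the workload sleepy_sum are hypothetical names used only for this illustration.

```python
import time


def measure(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    start_wall = time.perf_counter()   # wall-clock time, includes waiting
    start_cpu = time.process_time()    # CPU time used by this process only
    result = func(*args)
    wall = time.perf_counter() - start_wall
    cpu = time.process_time() - start_cpu
    return result, wall, cpu


def sleepy_sum(n):
    """Wait a little (no CPU needed), then do some real computation."""
    time.sleep(0.05)
    return sum(range(n))


result, wall, cpu = measure(sleepy_sum, 100_000)
print(f"result={result} wall={wall:.3f}s cpu={cpu:.3f}s")
```

Because the function spends most of its time sleeping, the wall-clock figure is noticeably larger than the CPU figure; the exact numbers vary from machine to machine.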

Use the time command to see the resource consumption of an HTTP request.

Profilers

I found this section rather confusing, probably because I have no use for it right now.

CPU

There are two kinds of CPU profilers: tracing profilers and sampling profilers. Tracing profilers keep a record of every function call your program makes, whereas sampling profilers probe your program periodically (commonly every millisecond) and record the program's stack.

Most programming languages have some sort of command-line profiler that you can use to analyze your code. They can usually be integrated into IDEs as well.
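As a small illustration, Python ships with cProfile, a profiler of the tracing kind that records every function call. The sketch below profiles a made-up workload (slow_square_sum and main are hypothetical names) and prints the functions sorted by cumulative time.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately simple workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def main():
    return [slow_square_sum(20_000) for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()    # start recording function calls
main()
profiler.disable()   # stop recording

# Collect the report in memory, sorted by cumulative time per function.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

The same profiler can also be run from the command line with python -m cProfile -s cumulative followed by the script name, without changing the script itself.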

Memory

In languages like C or C++, a memory leak means your program never releases memory that it no longer needs. To help debug memory issues, you can use tools like Valgrind to identify memory leaks.

In garbage-collected languages like Python, a memory profiler is still useful, because as long as there are pointers to an object, it will not be garbage collected.
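The last point can be observed with Python's standard tracemalloc module. This is a minimal sketch assuming CPython, where dropping the last reference frees an object immediately; the variable name cache is made up for illustration.

```python
import tracemalloc

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

# About 1 MB of data kept alive only because `cache` refers to it.
cache = [bytes(1000) for _ in range(1000)]
held, _ = tracemalloc.get_traced_memory()

# Dropping the last reference lets the interpreter reclaim the memory.
del cache
released, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"while referenced: {held - baseline} bytes above baseline")
print(f"after del:        {released - baseline} bytes above baseline")
```

In a real program the reference that keeps memory alive is usually less obvious, for example a module-level list or dictionary that only ever grows, and that is the situation a memory profiler helps to find.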

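Resource Monitoring

General monitoring - Probably the most popular tool is htop, an improved version of top. htop presents various statistics for the currently running processes. It has many options and keybindings; common ones are <F6> to sort processes, t to show the tree hierarchy, and h to toggle threads. Also see glances, which has a similar implementation but a nicer user interface. For aggregate measurements across all processes, dstat is another handy tool that computes real-time resource metrics for many subsystems, such as I/O, networking, CPU utilization, and context switches;
I/O operations - iotop displays live I/O usage information and makes it easy to check whether a process is doing heavy disk I/O;
Disk usage - df displays metrics per partition, and du displays the disk usage of each file in the current directory. The -h flag tells these commands to print in a human-readable format. ncdu is a more interactive version of du that lets you navigate folders and delete files and folders as you go;
Memory usage - free displays the amount of free memory in the system. Memory is also shown in tools like htop;
Open files - lsof lists information about files opened by processes. It is very useful for checking which process has a specific file open;
Network connections and configuration - ss lets you monitor incoming and outgoing network packet statistics as well as interface statistics. A common use case of ss is figuring out which process is using a given port. To display routing, network devices, and interfaces, you can use ip. Note that netstat and ifconfig have been deprecated in favor of these tools;
Network usage - nethogs and iftop are good interactive command-line tools for monitoring network usage.

I skipped the exercises for this lecture. For me, the main takeaway from this chapter is how to use the debugging tools; profiling is not very useful to me right now.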

Original site

Copyright notice
This article was written by [ek1ng]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/222/202208102229582763.html
