Anne

Anne.github.io


Introduction to Reinforcement learning

Posted on 2018-08-04 | In Machine Learning

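Reinforcement learning

What is reinforcement learning

It sits between supervised learning (which has labels) and unsupervised learning.

  • There is no supervision signal, only a "reward" feedback signal
  • Feedback is usually delayed
  • Samples usually do not satisfy the independent and identically distributed (i.i.d.) assumption
  • The agent's actions affect the distribution of the data it observes afterwards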
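
The properties listed above can be illustrated with a minimal sketch of an agent interacting with an environment. The `Corridor` environment, its `step` method, and `run_episode` are hypothetical names invented for this illustration, not part of any library: the reward is zero until the goal is reached (delayed feedback), and each observation depends on the previous action (so the samples are not i.i.d.).

```python
class Corridor:
    """Hypothetical toy environment: positions 0..length, reward only at the far end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def step(self, action):
        # action is +1 (move right) or -1 (move left);
        # the next observation depends directly on the chosen action
        self.position = max(0, min(self.length, self.position + action))
        done = self.position == self.length
        reward = 1.0 if done else 0.0  # delayed: zero until the goal is reached
        return self.position, reward, done


def run_episode(policy, env, max_steps=100):
    """Run one episode and return the total reward and the visited trajectory."""
    observation = env.position
    total_reward = 0.0
    trajectory = []
    for _ in range(max_steps):
        action = policy(observation)
        observation, reward, done = env.step(action)
        trajectory.append((action, observation, reward))
        total_reward += reward
        if done:
            break
    return total_reward, trajectory


# A policy that always moves right reaches the goal after 5 steps
total, trajectory = run_episode(lambda obs: +1, Corridor(length=5))
```

Only the last step of this trajectory carries a non-zero reward, which is the kind of delayed signal the agent has to learn from.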

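Examples:

  • Helicopter flight control
    • +r following the planned route
    • -r deviating from it or crashing
  • Portfolio management
    • +r profit
    • -r loss
  • Power station control
    • +r stable power output
    • -r exceeding the safety warning threshold
  • Atari game control
    • +r increasing the game score
    • -r decreasing the game score

Sequential decision problems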


Interview--Operating System

Posted on 2018-07-28 | In Interview

Interview--Computer Network

Posted on 2018-07-28 | In Interview

IMPORTANT POINT

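1. Computer network architecture

Definition: the set of a computer network's layers together with their protocols

Purpose: defines the functions that the computer network can perform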


Deep Learning-Summary

Posted on 2018-07-28 | In Machine Learning

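Summary of some important points

Softmax vs. Sigmoid

| | Softmax | Sigmoid |
|---|---|---|
| Formula | $\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}$ | $S(x) = \frac{1}{1+e^{-x}}$ |
| Nature | Discrete probability distribution | Nonlinear mapping |
| Task | Multi-class classification | Binary classification |
| Domain | A one-dimensional vector | A single scalar |
| Range | [0, 1] | (0, 1) |
| Sum of outputs | Always 1 | Some positive number |

Sigmoid is the special case of Softmax in which the number of classes is 2: applying Softmax to the two scores $(x, 0)$ gives $\frac{e^{x}}{e^{x}+1} = \frac{1}{1+e^{-x}} = S(x)$ for the first class.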
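
The comparison above can be checked numerically with a small sketch. The helper names `softmax` and `sigmoid` are chosen for this illustration, and it uses only the standard library. Subtracting the maximum score before exponentiating is a common stability trick and is an addition here, not something the post states.

```python
import math


def softmax(z):
    """Softmax over a list of scores; returns a discrete probability distribution."""
    m = max(z)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic sigmoid of a single scalar (plain form, fine for moderate |x|)."""
    return 1.0 / (1.0 + math.exp(-x))


probs = softmax([2.0, 1.0, 0.1])
print(sum(probs))             # sums to 1 up to floating-point rounding

x = 0.7
print(softmax([x, 0.0])[0])   # first-class probability of a 2-class softmax
print(sigmoid(x))             # same value, up to floating-point rounding
```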


Introduction to Algorithms--QuickSort and its randomize

Posted on 2018-07-26 | In Algorithms
  • Divide and conquer
  • Sorts “in place”
  • Very practical (with tuning)

Divide and conquer

  1. Divide (key step): Partition the array into 2 subarrays around a pivot x, such that elements in the lower subarray $\leq$ x $\leq$ elements in the upper subarray
  2. Conquer: Recursively sort 2 subarrays
  3. Combine: Trivial

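Time complexity of the partition step: $O(n)$. Quicksort as a whole runs in $O(n \log n)$ expected time with a randomly chosen pivot and $O(n^2)$ in the worst case.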
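
The three steps above can be sketched in Python as follows. This is one possible in-place version, not necessarily the variant the full post uses: it assumes the pivot is picked uniformly at random and swapped to the front before partitioning, and the function names `partition` and `randomized_quicksort` are chosen for this illustration.

```python
import random


def partition(a, lo, hi):
    """Partition a[lo..hi] around the pivot a[lo]; return the pivot's final index."""
    pivot = a[lo]
    i = lo  # a[lo+1..i] holds the elements <= pivot seen so far
    for j in range(lo + 1, hi + 1):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[lo], a[i] = a[i], a[lo]  # place the pivot between the two subarrays
    return i


def randomized_quicksort(a, lo=0, hi=None):
    """Sort the list a in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        r = random.randint(lo, hi)            # random pivot index, inclusive bounds
        a[lo], a[r] = a[r], a[lo]
        p = partition(a, lo, hi)              # divide: one linear pass
        randomized_quicksort(a, lo, p - 1)    # conquer: lower subarray
        randomized_quicksort(a, p + 1, hi)    # conquer: upper subarray
        # combine: nothing to do, the array is already in order


data = [5, 2, 9, 1, 5, 6]
randomized_quicksort(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```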


MathJax escaping issues

Posted on 2018-07-26 | In Coding

Issue

  • The subscript character _ does not work
  • matrix does not work

Markdown's own special characters conflict with the symbols used in LaTeX


Introduction to Algorithms--Sort problems

Posted on 2018-07-07 | In Algorithms

(Pseudocode)

Insertion Sort

// scan the sorted prefix from right to left
INSERTION(A)
    for j = 2 to A.length
        key = A[j]
        i = j - 1
        while i > 0 and A[i] > key
            A[i+1] = A[i]
            i = i - 1
        A[i+1] = key

Worst-case time complexity: $O(n^2)$
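
For reference, the pseudocode above translates to Python roughly as follows. The only changes are the 0-based indexing (so the loop starts at index 1 and the inner test becomes `i >= 0`) and the function name `insertion_sort`, which is chosen for this sketch.

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # shift elements of the sorted prefix a[0..j-1] that are larger than key
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

The nested loops are what give the quadratic bound: on a reversed input, each key is shifted past every element before it.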


Jobdu-27-Simple Calculator

Posted on 2018-06-26 | In Jobdu

Stack

Question

Expression evaluation


Jobdu-22-No AC This Summer

Posted on 2018-06-25 | In Jobdu

Greedy

Question

Problem Description
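"No AC this summer?"
"That's right."
"Then what will you do?"
"Watch the World Cup, silly!"
"@#$%^&*%…"

Indeed, the World Cup is here, and so is the fans' festival. Many ACMers will probably abandon their computers and head for the TV.
As a fan, you surely want to watch as many complete matches as possible. Of course, as a good young person of the new era, you will also watch some other programs, such as Xinwen Lianbo (never forget to care about national affairs), Feichang 6+7, Super Girl, and Wang Xiaoya's Happy Dictionary. Suppose you already know the broadcast schedule of every TV program you like. Can you plan your viewing sensibly? (The goal is to watch as many complete programs as possible.)

Input
The input contains multiple test cases. The first line of each test case contains a single integer n (n<=100), the total number of programs you like. It is followed by n lines, each containing two values Ti_s, Ti_e (1<=i<=n), the start and end times of the i-th program. To simplify the problem, every time is given as a positive integer. n=0 marks the end of the input and is not processed.

Output
For each test case, output the number of TV programs that can be watched in full. The output for each test case occupies one line.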

Sample Input

12
1 3
3 4
0 7
3 8
15 19
15 20
10 15
8 18
6 12
5 10
4 14
2 9
0

Sample Output

5
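
Since the post is tagged Greedy, one plausible approach is the classic earliest-finish-time rule: sort the programs by end time and take each one that starts no earlier than the previously chosen one ends. The sketch below assumes that a program may start at the exact moment the previous one ends, which is what the sample answer requires. The function name `max_complete_programs` is chosen for this illustration, and input parsing is left out.

```python
def max_complete_programs(programs):
    """Return the largest number of non-overlapping (start, end) programs."""
    count = 0
    current_end = float("-inf")
    # greedy choice: always take the compatible program that finishes first
    for start, end in sorted(programs, key=lambda p: p[1]):
        if start >= current_end:
            count += 1
            current_end = end
    return count


sample = [(1, 3), (3, 4), (0, 7), (3, 8), (15, 19), (15, 20),
          (10, 15), (8, 18), (6, 12), (5, 10), (4, 14), (2, 9)]
print(max_complete_programs(sample))  # 5
```

On the sample, the rule picks (1, 3), (3, 4), (5, 10), (10, 15) and (15, 19).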

Jobdu-21-FatMouse's house

Posted on 2018-06-25 | In Jobdu

Question

Problem description:
FatMouse prepared M pounds of cat food, ready to trade with the cats guarding the warehouse containing his favorite food, JavaBean. The warehouse has N rooms. The i-th room contains J[i] pounds of JavaBeans and requires F[i] pounds of cat food. FatMouse does not have to trade for all the JavaBeans in the room; instead, he may get J[i] * a% pounds of JavaBeans if he pays F[i] * a% pounds of cat food. Here a is a real number. Now he is assigning this homework to you: tell him the maximum amount of JavaBeans he can obtain.

Input:
The input consists of multiple test cases. Each test case begins with a line containing two non-negative integers M and N. Then N lines follow, each containing two non-negative integers J[i] and F[i] respectively. The last test case is followed by two -1's. All integers are not greater than 1000.

Output:
For each test case, print in a single line a real number accurate up to 3 decimal places, which is the maximum amount of JavaBeans that FatMouse can obtain.

Sample input:
5 3
7 2
4 3
5 2
20 3
25 18
24 15
15 10
-1 -1

Sample output:
13.333
31.500
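
Because rooms can be traded fractionally, this is a fractional-knapsack problem, and a greedy rule is a natural fit: trade with rooms in decreasing order of JavaBeans per pound of cat food until the cat food runs out. The sketch below is one way to write that, not necessarily the solution in the full post. The function name `max_javabeans` is chosen for this illustration, rooms with F[i] = 0 are treated as free and taken first, and input parsing is left out.

```python
def max_javabeans(m, rooms):
    """m: pounds of cat food; rooms: list of (J, F) pairs. Returns the max JavaBeans."""

    def ratio(room):
        j, f = room
        return float("inf") if f == 0 else j / f  # free rooms sort first

    total = 0.0
    budget = float(m)
    for j, f in sorted(rooms, key=ratio, reverse=True):
        if f <= budget:
            total += j                  # trade for the whole room
            budget -= f
        else:
            total += j * budget / f     # trade for the affordable fraction
            break                       # no cat food left after this
    return total


print(f"{max_javabeans(5, [(7, 2), (4, 3), (5, 2)]):.3f}")          # 13.333
print(f"{max_javabeans(20, [(25, 18), (24, 15), (15, 10)]):.3f}")   # 31.500
```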

Anne_ZAJ

© 2019 Anne_ZAJ